u* Threshold estimation

Problem

Identifying conditions with insufficient turbulence (indicated by low friction velocity u*) and marking those conditions as data gaps is important to avoid biases in fluxes measured by eddy covariance.

During stable stratification and low turbulent mixing, the eddy covariance method faces several problems that introduce bias and uncertainty. These problems occur primarily at night and lead to an underestimation of the night-time flux, i.e. the ecosystem respiration. They can be detected via a micro-meteorological quality control that tests whether the assumptions of the eddy covariance method are violated too strongly for a particular half hour (e.g. Foken and Wichura, 1996; weblink). When the information necessary for those tests is not available, a widely accepted class of heuristic methods assumes that a site- and season-specific threshold of friction velocity (u*) can be established above which night-time fluxes are considered valid. This threshold is usually established by relating the night-time flux to friction velocity while accounting for temperature as a covariate (u*-filtering).

Description

Here the minimum friction velocity u* is estimated according to the method described in Papale et al. (2006). Alternatively, Change Point Detection can be applied to detect the saturation of NEE with increasing u*, similar to the method described in Barr et al. (2013).

Both methods work on data where solar radiation is below a threshold (default: Rg < 10 W m-2), for which photosynthesis is assumed to be near zero. The data are subset to similar environmental conditions aside from friction velocity: adjacent times (seasons) and temperature classes (default: 7 classes). Within each season/temperature subclass, the u* threshold at which NEE saturates is estimated. Next, the u* threshold estimates of those subclasses are aggregated to one u* threshold for each year.
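
The subsetting step can be illustrated with a minimal Python sketch. It is not the REddyProc code. The field names "Rg" and "Tair" and both function names are hypothetical, and splitting the temperature classes into groups of roughly equal size is an assumption made for illustration.

```python
def night_records(records, rg_max=10.0):
    """Keep records whose solar radiation Rg (W m-2) is below rg_max."""
    return [r for r in records if r["Rg"] < rg_max]


def temperature_classes(records, n_classes=7):
    """Split records into n_classes groups of roughly equal size.

    Assumption: classes are formed by sorting on air temperature and
    cutting the sorted list into consecutive, near-equal parts.
    """
    ordered = sorted(records, key=lambda r: r["Tair"])
    n = len(ordered)
    bounds = [round(i * n / n_classes) for i in range(n_classes + 1)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(n_classes)]


# Usage: every second half hour is dark in this artificial example.
records = [{"Rg": 0.0 if i % 2 == 0 else 200.0, "Tair": float(i)}
           for i in range(28)]
night = night_records(records)
classes = temperature_classes(night)
```

Each element of classes would then be passed, season by season, to the threshold estimation.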

Season

Within one year: Papale et al. (2006) applied the estimation procedure to annual subsets of the data. Therefore, seasons could not span several years. The default applied with the online tool starts seasons at the beginning of March, June, September, and December. Data of December are treated as part of the same season as January and February of the same year.
Continuous: The REddyProc package is designed to avoid breaks at year boundaries. Hence, it uses continuous seasons by default. Default seasons also start at the beginning of March, June, September, and December. However, December is treated as part of the same season as January and February of the next year. The annual u* thresholds are then applied according to those continuous seasons spanning year boundaries. For example, the estimate for 2014 is applied from winter 2014 (starting in December 2013) to autumn 2014 (ending in November 2014).

User specified: Seasons represent periods of similar micrometeorological conditions, such as friction. To allow site-specific specification of changes, e.g. in surface roughness due to management, the seasons are compiled from the column "seasons" provided with the input data. The column must list the same value for each record belonging to one season. Each season is associated with the year of the day midway between the first and last day of the season.
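
The three season schemes can be compared with a small Python sketch. The function names and the season labels "DJF", "MAM", "JJA" and "SON" are hypothetical and chosen only for illustration. The sketch shows how each scheme associates a record or a season with a year.

```python
from datetime import date

# Hypothetical labels for the seasons starting in December, March, June, September.
SEASON_OF_MONTH = {12: "DJF", 1: "DJF", 2: "DJF",
                   3: "MAM", 4: "MAM", 5: "MAM",
                   6: "JJA", 7: "JJA", 8: "JJA",
                   9: "SON", 10: "SON", 11: "SON"}


def season_within_year(year, month):
    """December joins January and February of the same calendar year."""
    return (year, SEASON_OF_MONTH[month])


def season_continuous(year, month):
    """December joins January and February of the following year."""
    season_year = year + 1 if month == 12 else year
    return (season_year, SEASON_OF_MONTH[month])


def season_year_user(first_day, last_day):
    """Year of the day midway between the first and last day of a season."""
    middle = first_day + (last_day - first_day) / 2
    return middle.year


# December 2013 belongs to winter 2014 under the continuous scheme.
print(season_continuous(2013, 12))
print(season_year_user(date(2013, 12, 1), date(2014, 2, 28)))
```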

With all methods, a minimum number of records is required in each season. If there are too few records, the data of the seasons within one year are combined before estimating the u* threshold.

Moving Point test versus Change Point detection

Papale et al. (2006) estimated the u* threshold in each season/temperature subclass by binning the records into classes of similar u* and computing the mean NEE and mean u* for each class (default: 20 classes). A moving point test was then applied to each bin to determine the threshold. With the forward2 method, applied with the BGI online tool, it is checked for each bin whether the bin NEE is higher than 0.95 times the mean of the following 10 bin NEE values. If this also holds true for the next bin, the mean u* of the bin is reported as the threshold.
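
The forward2 test described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the BGI or REddyProc code. The function name is hypothetical, and where fewer than 10 bins follow, the sketch assumes that the mean is taken over the bins that are available.

```python
from statistics import mean


def moving_point_threshold(bin_ustar, bin_nee, n_forward=10, ratio=0.95):
    """Return the mean u* of the first bin that passes the test, else None.

    bin_ustar and bin_nee hold the per-bin means, ordered by increasing u*.
    """
    def on_plateau(i):
        following = bin_nee[i + 1:i + 1 + n_forward]
        # Assumption: with no following bins the test cannot be passed.
        return bool(following) and bin_nee[i] > ratio * mean(following)

    for i in range(len(bin_nee) - 1):
        # The condition must hold for this bin and for the next bin.
        if on_plateau(i) and on_plateau(i + 1):
            return bin_ustar[i]
    return None


# NEE rises over the first four bins and then saturates at 5.
bin_ustar = [0.05 * (i + 1) for i in range(20)]
bin_nee = [1.0, 2.0, 3.0, 4.0] + [5.0] * 16
threshold = moving_point_threshold(bin_ustar, bin_nee)
```

In this example the first bin on the plateau is the fifth one, so its mean u* is returned.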

There are slight differences between the Papale et al. (2006) and the BGI binning schemes. Both methods bin so that the number of records in each bin is the same. With both methods, numerically equal u* values are sorted into the same bin, which results in bins with more records. With the C-implementation by Papale et al., fewer and sometimes no records are sorted into the following bins. In contrast, the binning of the REddyProc package used by the BGI online tool ensures that there is a minimum number of records in all bins. This often results in fewer bins than without numerically equal u* values. Moreover, differing from the Papale et al. C-implementation, the REddyProc package does not report a threshold in those data subsets where the NEE plateau (after the u* threshold bin) consists of fewer than three points, i.e. bins.
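
The effect of tied u* values on the number of bins can be seen in a simplified Python sketch. It does not reproduce either implementation. The function name, the way the minimum bin size is derived, and the merging of a short last bin are all assumptions made for illustration.

```python
def bin_keep_ties(values, n_bins=20):
    """Bin sorted values so that ties share a bin and no bin is too small.

    Assumption: the minimum bin size is len(values) // n_bins, and a final
    bin that stays below this size is merged into the previous bin.
    """
    ordered = sorted(values)
    target = max(1, len(ordered) // n_bins)
    bins, current = [], []
    for i, value in enumerate(ordered):
        current.append(value)
        is_last = i == len(ordered) - 1
        # Close the bin only when it is full and the next value differs.
        if len(current) >= target and (is_last or ordered[i + 1] != value):
            bins.append(current)
            current = []
    if current:
        if bins:
            bins[-1].extend(current)
        else:
            bins.append(current)
    return bins


# Five equal u* values end up in one large bin, so fewer than 6 bins result.
ustar = [0.1] * 5 + [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
bins = bin_keep_ties(ustar, n_bins=6)
print([len(b) for b in bins])
```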

Barr et al. (2013) instead used Change Point Detection to infer the bin with the highest probability of being a change point in the u*-NEE relationship. REddyProc implements the similar RTw method. It estimates the breakpoint within the season/temperature classes based on the unbinned raw values using the segmented package. Hence, it avoids the binning of u* values, to which results are sometimes very sensitive. Note also that REddyProc differs from Barr et al. (2013) by using the Papale et al. (2006) aggregation scheme (described next) also with the Change Point Detection.
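
The idea of estimating a breakpoint from unbinned values can be illustrated with a plain grid search in Python. This is not the algorithm of the segmented package and not the RTw method. It is a swapped-in, simplified technique that assumes NEE rises linearly with u* below the breakpoint and is constant above it. The function name is hypothetical.

```python
def fit_breakpoint(ustar, nee):
    """Return the breakpoint c that minimises the squared error of
    nee ~ a + b * min(ustar, c), searched over the observed u* values."""
    best_sse, best_c = None, None
    candidates = sorted(set(ustar))[1:-1]  # interior values only
    n = len(ustar)
    for c in candidates:
        z = [min(u, c) for u in ustar]
        mean_z, mean_y = sum(z) / n, sum(nee) / n
        szz = sum((zi - mean_z) ** 2 for zi in z)
        if szz == 0:
            continue
        # Ordinary least squares of nee on the clipped u* values.
        b = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, nee)) / szz
        a = mean_y - b * mean_z
        sse = sum((yi - a - b * zi) ** 2 for zi, yi in zip(z, nee))
        if best_sse is None or sse < best_sse:
            best_sse, best_c = sse, c
    return best_c


# Noise-free example in which NEE saturates at u* = 0.2.
ustar = [i / 100 for i in range(2, 62, 2)]
nee = [10 * min(u, 0.2) for u in ustar]
breakpoint_ustar = fit_breakpoint(ustar, nee)
```

With real, noisy data a dedicated change point method would also assess whether a breakpoint is statistically supported, which this sketch does not do.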

Aggregation to annual u* threshold

The online tool uses the same aggregation scheme as Papale et al. (2006). Within one season, the median is taken across the thresholds of different temperature classes. Within one year, the maximum is taken across the associated seasons.

Differing from the Papale et al. C-implementation, the REddyProc package does not report a seasonal estimate if a threshold was found in fewer than 20% of the temperature subsets within the season.
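
The aggregation can be written down in a few lines of Python. The function names are hypothetical, and a missing estimate is represented by None, which is an assumption of this sketch.

```python
from statistics import median


def seasonal_threshold(class_thresholds, min_fraction=0.2):
    """Median across temperature classes, or None if too few were found.

    class_thresholds has one entry per temperature class; None marks a
    class in which no threshold was detected.
    """
    found = [t for t in class_thresholds if t is not None]
    if not class_thresholds or len(found) < min_fraction * len(class_thresholds):
        return None
    return median(found)


def annual_threshold(seasonal_thresholds):
    """Maximum across the seasons associated with one year."""
    valid = [t for t in seasonal_thresholds if t is not None]
    return max(valid) if valid else None


# Three of seven temperature classes yielded an estimate in this season.
season = seasonal_threshold([0.1, None, 0.3, 0.2, None, None, None])
year = annual_threshold([season, None, 0.35, 0.15])
```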

Bootstrap to estimate the uncertainty of the threshold

Estimates of the u* threshold are often sensitive to the specifics of the methods, e.g. the binning, the minimum number of records, the criteria in aggregation, etc. Therefore, bootstrap (resampling with replacement) is applied to generate 200 pseudoreplicates of the dataset, and the threshold is estimated for each replicate. Quantiles (default: 5%, 50% and 95%) of the estimates are reported as a range of threshold estimates. The gap filling and partitioning are then applied using those different thresholds to propagate the uncertainty of the u* threshold estimation to NEE, GPP and Reco.

In contrast to the Papale et al. C-implementation, the REddyProc package resamples the data only within seasons instead of across the entire dataset.
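
Resampling within seasons and reporting quantiles can be sketched in Python as follows. The function names, the dictionary layout of the data and the simple index-based quantile rule are assumptions. The estimate argument stands for any function that returns one numeric threshold for a resampled dataset.

```python
import random


def resample_within_seasons(records_by_season, rng):
    """Draw, with replacement, as many records per season as it contains."""
    return {season: rng.choices(recs, k=len(recs))
            for season, recs in records_by_season.items()}


def bootstrap_quantiles(records_by_season, estimate, n_boot=200,
                        probs=(0.05, 0.5, 0.95), seed=0):
    """Estimate the threshold on n_boot pseudoreplicates and return quantiles.

    Assumption: a quantile is read off the sorted estimates by index,
    without interpolation.
    """
    rng = random.Random(seed)
    estimates = sorted(
        estimate(resample_within_seasons(records_by_season, rng))
        for _ in range(n_boot)
    )
    return [estimates[min(n_boot - 1, int(p * n_boot))] for p in probs]


# Toy data and a toy estimator: the largest seasonal mean.
data = {"DJF": [0.1, 0.2, 0.3], "JJA": [0.4, 0.5]}
toy_estimate = lambda d: max(sum(v) / len(v) for v in d.values())
low, mid, high = bootstrap_quantiles(data, toy_estimate)
```

Because records are drawn separately for each season, every pseudoreplicate keeps the original number of records per season.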

Comparison of the implementations

The REddyProc implementation has been benchmarked against the Papale et al. C-implementation.

Main conclusions are:

There is no systematic difference across sites in u* threshold estimates and inferred annual NEE between the Papale et al. C-implementation and the Moving Point Test of the REddyProc implementation.
The difference in bootstrapping method results in the REddyProc package estimating only half the uncertainty.
The Change Point Detection method often gives higher u* threshold estimates than the Moving Point Test.

Reference

Papale D, Reichstein M, Aubinet M, Canfora E, Bernhofer C, Kutsch W, Longdoz B, Rambal S, Valentini R, Vesala T et al. (2006) Towards a standardized processing of Net Ecosystem Exchange measured with eddy covariance technique: algorithms and uncertainty estimation. Biogeosciences, Copernicus GmbH, 3, 571-583