Benchmarking R code of uStar-threshold estimation

Scope

Estimating the uStar (u*, friction velocity) threshold, the criterion for sufficiently turbulent conditions, is a prerequisite for interpreting eddy-covariance data. The standard method is described in the paper of Reichstein 2006.

Here we compare an R-implementation of this method to two C-implementations.

Co-version: original executable from Dario Papale's group, compiled there in Nov. 2013.
CTw-version: based on the C source code of Dario Papale's group, with the following modifications
- Modified uStar binning scheme that ensures that each class has at least the specified minimum number of records
- Quality assurance 1: when comparing the threshold bin to the plateau, it makes sure that the plateau consists of at least 3 bins
- Quality assurance 2: when aggregating the thresholds of the different temperature classes to a season, it makes sure that a threshold was found in at least 20% of the temperature classes.
- Quality assurance 3: when there are too few records within a year, the records of all seasons associated with this year are aggregated into one big season for threshold estimation.
RTw-version: re-implementation of the CTw-version in the R programming language, with the following modification
- Bootstrap sampling only within seasons
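
As an illustration of the modified binning and of quality assurance 2 listed above, here is a minimal Python sketch. It is not the code of the C or R versions: the function names, the equal-count split, and the use of the median for aggregating temperature classes are assumptions made for this example.

```python
from statistics import median

def bin_by_ustar(ustar, n_bins, min_records):
    """Split record indices into u* classes holding at least min_records each.

    Hypothetical helper: the number of classes is reduced until every
    class can hold the minimum number of records.
    """
    if len(ustar) < min_records:
        return []  # not enough records for even a single class
    order = sorted(range(len(ustar)), key=lambda i: ustar[i])
    n_bins = max(1, min(n_bins, len(order) // min_records))
    size = len(order) // n_bins
    bins = [order[k * size:(k + 1) * size] for k in range(n_bins)]
    # Leftover records go into the last (highest u*) class.
    bins[-1].extend(order[n_bins * size:])
    return bins

def aggregate_season(thresholds, min_valid_fraction=0.2):
    """Aggregate per-temperature-class thresholds (None = not found).

    Returns None unless a threshold was found in at least
    min_valid_fraction of the classes. The median is an assumed aggregate.
    """
    valid = [t for t in thresholds if t is not None]
    if not thresholds or len(valid) < min_valid_fraction * len(thresholds):
        return None
    return median(valid)
```

With 25 records, 20 requested classes, and a minimum of 5 records per class, this sketch falls back to 5 classes of 5 records each.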

The u* threshold was estimated for 878 site-years from 245 sites, and the estimates of the different versions were then compared. Furthermore, the consequences of the different u* estimates for annual sums of net ecosystem exchange (NEE) were estimated.

In all versions there is an estimate based on the raw data. To infer the uncertainty of this estimate, a bootstrap is performed, and quantities such as the median bootstrap estimate or confidence intervals are computed from it.
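
The bootstrap summary described above can be sketched as follows. This is only an illustration under stated assumptions: `toy_estimate` is a stand-in that simply averages u* and is not the threshold detection itself, and the function names, the number of replicates, and the quantile interpolation are choices made for this example.

```python
import random

def quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted list."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

def toy_estimate(records):
    """Stand-in estimator (NOT the real threshold detection): mean u*."""
    return sum(records) / len(records)

def bootstrap_summary(records, estimate, n_boot=200, seed=0):
    """Raw estimate plus median and 10%/90% quantiles of bootstrap estimates."""
    rng = random.Random(seed)
    raw = estimate(records)
    # Each replicate resamples the records with replacement.
    boot = sorted(
        estimate([rng.choice(records) for _ in records]) for _ in range(n_boot)
    )
    return {
        "raw": raw,
        "median": quantile(boot, 0.5),
        "q10": quantile(boot, 0.1),
        "q90": quantile(boot, 0.9),
    }
```

A call such as `bootstrap_summary(ustar_values, toy_estimate)` returns the raw estimate together with the bootstrap median and the bounds of the 10% to 90% range.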

Results u* estimation

The scatterplot in Fig. 1 shows the threshold estimates of the different methods versus the estimate of the CTw method. Each point is the estimate for one year at one site. The regression lines of all Moving-Point-Test methods (all except CPT) lie on the 1:1 line (black dashed line). Hence, there is no consistent bias among the estimates of these methods.

The results of the CTw and the RTw versions are numerically equal (differences < 1e-6).

There are no consistent patterns of deviation among the versions.

[Figure 1]

Fig. 2 is similar to Fig. 1, except that each point is the bootstrap median of the data for one site-year.

Due to the different bootstrap methods, there are differences between the CTw and the RTw versions.

At sites where the C-versions found very high thresholds, the RTw version sometimes provides lower estimates. This is probably due to differences in rounding and in the number of valid thresholds between the versions.

The scatter of the differences between the Co and CTw estimates is smaller than for the raw estimates. Hence, the median threshold estimate is more robust. The regression lines are close to the 1:1 line, apart from deviations where the C-versions detect very high thresholds.

[Figure 2]

When the differences between the versions are normalized by the bootstrap range at each site, Fig. 3 shows that almost all differences among the methods are within half of the 10% to 90% uncertainty range, i.e. there are no significant differences between the methods.
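
The normalization described above can be written down in a few lines. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the 10% and 90% bootstrap quantiles of a single reference version define the range, which the text does not specify.

```python
def normalized_difference(est_a, est_b, q10, q90):
    """Difference of two threshold estimates relative to the bootstrap range.

    q10 and q90 are the 10% and 90% bootstrap quantiles of the reference
    version (an assumption of this sketch). Returns None if the range
    has zero or negative width.
    """
    width = q90 - q10
    if width <= 0:
        return None
    return (est_a - est_b) / width

def within_half_range(est_a, est_b, q10, q90):
    """True if the estimates differ by at most half the 10% to 90% range."""
    nd = normalized_difference(est_a, est_b, q10, q90)
    return nd is not None and abs(nd) <= 0.5
```

For example, estimates of 0.30 and 0.25 with a bootstrap range from 0.2 to 0.4 give a normalized difference of about 0.25, which is within half the range.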

[Figure 3]

The most striking difference between the methods is in the precision of the threshold estimates (Fig. 4). The two C-versions roughly agree (the ratio of the ranges is close to one), while the two R-versions estimate nearly double the precision because they respect season boundaries in the bootstrap.

[Figure 4]

Effects on Gapfilled NEE

The different estimates of u* threshold were used to mark gaps, gap-fill the data, and calculate the cumulative annual Net ecosystem exchange based on the gap-filled time series.
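
The processing chain described above (mark gaps, fill gaps, sum over the year) might look like the following sketch. Several parts are assumptions and not taken from the benchmark: NEE is assumed to be given in umol CO2 m-2 s-1 at half-hourly resolution, all function names are hypothetical, and the gap-filling step is a deliberately naive mean fill standing in for the real gap-filling algorithm.

```python
GRAMS_C_PER_UMOL_CO2 = 12.011e-6  # molar mass of carbon, in g per umol

def mark_gaps(nee, ustar, threshold):
    """Replace NEE by None where u* is missing or below the threshold."""
    return [
        v if (v is not None and u is not None and u >= threshold) else None
        for v, u in zip(nee, ustar)
    ]

def fill_with_mean(series):
    """Naive stand-in for gap-filling: fill gaps with the mean of valid values."""
    valid = [v for v in series if v is not None]
    if not valid:
        raise ValueError("no valid records to fill from")
    mean = sum(valid) / len(valid)
    return [mean if v is None else v for v in series]

def annual_sum_gC(nee_umol, seconds_per_record=1800):
    """Cumulative NEE in gC m-2, assuming fluxes in umol CO2 m-2 s-1."""
    return sum(nee_umol) * seconds_per_record * GRAMS_C_PER_UMOL_CO2
```

Running this chain once per u* threshold estimate, on the same input series, gives one annual NEE value per method, which can then be compared across methods.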

The effects of the methods on the u* threshold estimate were propagated to the cumulative annual NEE:

- The results of the CTw and RTw versions were numerically equal.
- There was no consistent bias across sites among the methods.
- The absolute differences in NEE between the methods were small (mostly < 20 gC/m2/yr).
- The differences were mostly within half of the 10% to 90% confidence interval.
  - Exception: the change point detection method sometimes yielded stronger deviations.
- The two C-versions roughly agree on the precision of the NEE estimate.
  - The R-implementation (with within-season bootstrap) estimates roughly double the precision.

[Figure]

Discussion

The differences in binning, aggregation of the bins, and quality assurance criteria result in differences in the annual estimate of the u* threshold. Those differences among the methods, however, are within the uncertainty range estimated by a single method and do not introduce significant bias.

The difference in how seasons are treated during the bootstrap, however, results in large differences in the precision of the threshold estimate. While the C-versions resample the entire year, the R-versions perform the bootstrap within each season so that conditions with different surface roughness are not mixed. This procedure doubles the estimated precision.
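
The two resampling schemes contrasted above can be sketched as follows. This is a minimal illustration, not the code of either implementation; the function names and the representation of seasons as one label per record are assumptions of this example.

```python
import random
from collections import defaultdict

def resample_whole_year(seasons, rng):
    """Draw record indices with replacement from the entire year."""
    n = len(seasons)
    return [rng.randrange(n) for _ in range(n)]

def resample_within_seasons(seasons, rng):
    """Draw record indices with replacement separately within each season.

    seasons holds one season label per record. Every season keeps its
    original number of records, so seasons are never mixed.
    """
    by_season = defaultdict(list)
    for i, label in enumerate(seasons):
        by_season[label].append(i)
    sample = []
    for indices in by_season.values():
        sample.extend(rng.choice(indices) for _ in indices)
    return sample
```

In the whole-year scheme the number of records drawn from each season varies between replicates, whereas the within-season scheme keeps the season sizes fixed, which removes one source of variation between bootstrap replicates.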

Conclusion

We did not find a consistent under- or overestimation of uStar values among the studied versions. All methods agreed on both the raw u* estimate and the median estimate, as the differences among them were well within the uncertainty of a single method.

Respecting seasonal boundaries during the bootstrap doubled the precision of the threshold estimates.