Identify kinetic SOM fractions with incubation data
SoilR, in combination with another optimization package such as FME, can assist you in identifying kinetic soil organic matter fractions through the process of parameter estimation. The procedure consists of: 1) prepare a dataset of observed CO2 fluxes from a soil incubation experiment, 2) setting up a model that can potentially explain the data, 3) create a cost function to optimize, 4) perform the optimization.
For this example, we will use data from an incubation experiment performed by Dr. Susan E. Crow at the University of Hawai’i using a soil from a bare agricultural field. For the optimization we will use R package FME
.
1. Preparation of the dataset
The dataset contains observations of CO2 efflux from soil incubations carried out in 2010 during 191 days. The data is presented as cummulative CO2 and the average of 4 replicates.
CO2flux=data.frame(
time=c(1,2,5,8,13,18,23,28,33,38,48,62,78,93,107,121,135,149,163,177,191),
cummCO2=c(0.140,0.246,0.474,0.690,0.926,1.125,1.316,1.458,1.606,1.755,1.873,
1.967,2.111,2.199,2.284,2.372,2.460,2.545,2.585,2.622,2.653),
Stderr=c(0.054,0.060,0.078,0.100,0.152,0.187,0.204,0.219,0.222,0.223,0.253,
0.305,0.320,0.329,0.362,0.390,0.401,0.413,0.444,0.462,0.488)
)
The initial amount of C at the begining of the incubation is
Ctotal=20.0 #C concentration in soil sample (mg C g-1 soil)
A plot of the data
plot(CO2flux[,1:2],type="p",xlab="Days",
ylab="Cummulative respiration (mg C g-1 soil)", ylim=c(0,4))
arrows(CO2flux[,1],CO2flux[,2]-CO2flux[,3],CO2flux[,1],CO2flux[,2]+
CO2flux[,3],angle=90, length=0.1,code=3)
2. Setting up the model
For this example we will set up a 2-pool model with feedback of the form
$$ \left( \begin{array}{c} \frac{dC_1}{dt} \\ \frac{dC_2}{dt} \end{array} \right) = \left( \begin{array}{cc} -k_1 & \alpha_{1,2} k_2 \\ \alpha_{2,1} k_1 & -k_2 \end{array} \right) \cdot \left( \begin{array}{c} C_1 \\ C_2 \end{array} \right) $$
with initial conditions
$$ \left( \begin{array}{c} C_1(t= 0) = \gamma_1 C_{total} \\ C_2(t=0) = (1-\gamma_1) C_{total} \end{array} \right) $$
For this example, we will identify the decomposition rates \(k\), the transfer coefficients \(\alpha\), and the fractionation coefficient \(\gamma_1\).
We now load the required SoilR
and FME
packages
library(SoilR)
library(FME)
and create a vector of time points to solve the model for these time points
days=seq(0,191) #Incubation days
Lastly, we set up the model using the function TwopFeedbackModel
to produce output that resembles the observed data
eCO2func=function(pars){
mod=TwopFeedbackModel(
t=days,
ks=pars[1:2],
a21=pars[3]*pars[1],
a12=pars[4]*pars[2],
C0=Ctotal*c(pars[5],1-pars[5]),
In=0,
pass=TRUE
)
AccR=getAccumulatedRelease(mod)
return(data.frame(time=days,cummCO2=rowSums(AccR)))
}
3. Create a cost/objective function
We now create a cost function that compares predictions and observations. This function will be used by the optimization algorithm to minimize the difference between predictions and observations. FME function modCost
can create the cost function automatically, and I strongly recommend you to look at the help of this function using the command ?modCost
. For our example we create this cost function as
eCO2cost=function(pars){
modelOutput=eCO2func(pars)
return(modCost(model=modelOutput, obs=CO2flux, err="Stderr"))
}
This cost function takes any set of parameter values, runs the model, and calculates the difference between predictions and observations using the reported standard errors to calculated weighed residuals.
4. Fit the model through optimization
We can now run an optimization of the cost function given an initial set of parameter values
inipars=c(k1=0.5,k2=0.05,alpha21=0.5,alpha12=0.1,gamma=0.5)
This initial set of parameters tells the optimization algorithm where to start the search. In case of poor convergance, you can change these initial values.
To run the optimization, we use the function modFit
from FME as
eCO2fit=modFit(f=eCO2cost,p=inipars,method="Nelder-Mead",
upper=c(Inf,Inf,1,1,1),lower=c(0,0,0,0,0))
Notice that we give lower bounds to all parameters and upper bounds to the transfer and fractionation coefficients. For this example we used the Nelder-Mead optimization algorithm, but you can choose between different methods. Details can be found in the help of the function ?modFit
. The best set of parameter values can be retrieved as
eCO2fit$par
## k1 k2 alpha21 alpha12 gamma
## 0.0644160343 0.0003811812 0.8918784332 0.1027636061 0.7700674412
We can now run the model again with the best parameter set and visually compare the fit
#Run the model again with best parameter set
fitmod=TwopFeedbackModel(t=days, ks=eCO2fit$par[1:2],
a21=eCO2fit$par[3]*eCO2fit$par[1],
a12=eCO2fit$par[4]*eCO2fit$par[2],
C0=Ctotal*c(eCO2fit$par[5],1-eCO2fit$par[5]),
In=0)
fitCum=getAccumulatedRelease(fitmod)
#Plot the results
plot(CO2flux[,1:2],type="p",xlab="Days",
ylab="Cummulative respiration (mg C g-1 soil)", ylim=c(0,4))
arrows(CO2flux[,1],CO2flux[,2]-CO2flux[,3],CO2flux[,1],CO2flux[,2]+
CO2flux[,3],angle=90, length=0.1,code=3)
lines(rowSums(fitCum))
You can also continue the analysis further and perform a Bayesian optimization. For details see the vignette in the FME
package.
Summary
With this procedure we were able to obtain 5 parameter values using only CO2 efflux and total initial carbon data from an incubation experiment. The procedure showed that one kinetic pool cycles at a rate of 0.06 per day, and the other pool cycles much slower at a rate of 0.0004 per day. The transfer coefficients also tell us that about 90% of the C cycled in the fast pool is transferred to the slow pool, and the slow pool feedbacks about 10% of the carbon processed there.
We can also infer that from the initial amount of C at the beginin of the incubation, 77% was stored in the fast cycling pool, while the remaining 33% was stored in the slow cycling pool. This procedure therefore, help us to determine the size of meaningful kinetic pools without resorting to physical or chemical fractionations.