14 Workflow modules

NOTE: As of PEcAn 1.2.6 – needs to be updated significantly

14.1 Overview

Workflow inputs and outputs (click to open in new page, then zoom). Code used to generate this image is provided in qaqc/vignettes/module_output.Rmd

PEcAn Workflow

14.1.1 Load Settings:

14.1.1.1 read.settings("/home/pecan/pecan.xml")

  • loads settings
  • create directories
  • generates new xml, put in output folder

14.1.2 Query Database:

14.1.2.1 get.trait.data()

Queries the database for both the trait data and prior distributions associated with the PFTs specified in the settings file. The list of variables that are queried is determined by what variables have priors associated with them in the definition of the pft. Likewise, the list of species that are associated with a PFT determines what subset of data is extracted out of all data matching a given variable name.

14.1.3 Meta Analysis:

14.1.3.1 run.meta.analysis()

The meta-analysis code begins by distilling the trait.data to just the values needed for the meta-analysis statistical model, with this being stored in madata.Rdata. This reduced form includes the conversion of all error statistics into precision (1/variance), and the indexing of sites, treatments, and greenhouse. In reality, the core meta-analysis code can be run independent of the trait database as long as input data is correctly formatted into the form shown in madata.

The evaluation of the meta-analysis is done using a Bayesian statistical software package called JAGS that is called by the R code. For each trait, the R code will generate a [trait].model.bug file that is the JAGS code for the meta-analysis itself. This code is generated on the fly, with PEcAn adding or subtracting the site, treatment, and greenhouse terms depending upon the presence of these effects in the data itself.

Meta-analyses are run, and summary plots are produced.

14.1.4 Write Configuration Files

14.1.4.1 write.configs(model)

  • writes out a configuration file for each model run ** writes 500 configuration files for a 500 member ensemble ** for n traits, writes 6 * n + 1 files for running default Sensitivity Analysis (number can be changed in the pecan settings file)

14.1.5 Start Runs:

14.1.5.1 start.runs(model)

This code starts the model runs using a model specific run function named start.runs.model. If the ecosystem model is running on a remote server, this module also takes care of all of the communication with the remote server and its run queue. Each of your subdirectories should now have a [run.id].out file in it. One instance of the model is run for each configuration file generated by the previous write configs module.

14.1.6 Get Model Output

14.1.6.1 get.model.output(model)

This code first uses a model-specific model2netcdf.model function to convert the model output into a standard output format (MsTMIP). Then it extracts the data for requested variables specified in the settings file as settings$ensemble$variable, averages over the time-period specified as start.date and end.date, and stores the output in a file output.Rdata. The output.Rdata file contains two objects, sensitivity.output and ensemble.output, that is the model prediction for the parameter sets specified in sa.samples and ensemble.samples. In order to save bandwidth, if the model output is stored on a remote system PEcAn will perform these operations on the remote host and only return the output.Rdata object.

14.1.7 Ensemble Analysis

14.1.7.1 run.ensemble.analysis()

This module makes some simple graphs of the ensemble output. Open ensemble.analysis.pdf to view the ensemble prediction as both a histogram and a boxplot. ensemble.ts.pdf provides a timeseries plot of the ensemble mean, meadian, and 95% CI

14.1.8 Sensitivity Analysis, Variance Decomposition

14.1.8.1 run.sensitivity.analysis()

This function processes the output of the previous module into sensitivity analysis plots, sensitivityanalysis.pdf, and a variance decomposition plot, variancedecomposition.pdf . In the sensitivity plots you will see the parameter values on the x-axis, the model output on the Y, with the dots being the model evaluations and the line being the spline fit.

The variance decomposition plot is discussed more below. For your reference, the R list object, sensitivity.results, stored in sensitivity.results.Rdata, contains all the components of the variance decomposition table, as well as the the input parameter space and splines from the sensitivity analysis (reminder: the output parameter space from the sensitivity analysis was in outputs.R).

The variance decomposition plot contains three columns, the coefficient of variation (normalized posterior variance), the elasticity (normalized sensitivity), and the partial standard deviation of each model parameter. This graph is sorted by the variable explaining the largest amount of variability in the model output (right hand column). From this graph identify the top-tier parameters that you would target for future constraint.

14.2 Glossary

  • Inputs: data sets that are used, and file paths leading to them
  • Parameters: e.g. info set in settings file
  • Outputs: data sets that are dropped, and the file paths leading to them