8 Developer guide

8.1 Updating PEcAn Code and Bety Database

Release notes for all releases can be found here.

This page lists only the steps needed to upgrade an existing system. When updating PEcAn, you are strongly encouraged to update BETY as well. Instructions for doing so, and for updating the database, are in the Updating BETYdb gitbook page.

8.1.1 Updating PEcAn

The latest version of PEcAn code can be obtained from the PEcAn repository on GitHub:

The PEcAn build system is based on GNU Make. The simplest way to install is to run make from inside the PEcAn directory. This will update the documentation for all packages and install them, as well as all required dependencies.

For more control, the following make commands are available:

  • make document – Use devtools::document to update the documentation for all packages. Under the hood, this uses the roxygen2 documentation system.

  • make install – Install all packages and their dependencies using devtools::install. By default, this only installs packages that have had their code changed and any dependent packages.

  • make check – Perform a rigorous check of packages using devtools::check

  • make test – Run all unit tests (based on testthat package) for all packages, using devtools::test

  • make clean – Remove the make build cache, which is used to track which packages have changed. Cache files are stored in the .doc, .install, .check, and .test subdirectories in the PEcAn main directory. Running make clean will force the next invocation of make commands to operate on all PEcAn packages, regardless of changes.

The following are some additional make tricks that may be useful:

  • Install, check, document, or test a specific package – make .<cmd>/<pkg-dir>; e.g. make .install/utils or make .check/modules/rtm

  • Force make to run, even if package has not changed – make -B <command>

  • Run make commands in parallel – make -j<ncores>; e.g. make -j4 install to install packages using four parallel processes.

All instructions for the make build system are contained in the Makefile in the PEcAn root directory. For full documentation on make, see the man pages by running man make from a terminal.

8.2 Git and GitHub Workflow

8.2.1 Using Git

This document describes the steps required to download PEcAn, make changes to code, and submit your changes.

8.2.1.1 Git

Git is a free & open source, distributed version control system designed to handle everything from small to very large projects with speed and efficiency. Every Git clone is a full-fledged repository with complete history and full revision tracking capabilities, not dependent on network access or a central server. Branching and merging are fast and easy to do.

A good place to start is the GitHub 5 minute illustrated tutorial. In addition, there are three fun tutorials for learning git:

URLs: The rest of this document uses specific URLs to clone the code. There are a few URLs you can use to clone a project, using https, ssh and git. You can use either https or ssh to clone a repository and write to it. The git protocol is read-only.

8.2.1.2 PEcAn Project and Github

These instructions apply to other repositories too.

8.2.1.3 PEcAn Project Branches

We follow branch organization laid out on this page.

In short, there are three main branches you must be aware of:

  • develop - Main Branch containing the latest code. This is the main branch you will make changes to.
  • master - Branch containing the latest stable code. DO NOT MAKE CHANGES TO THIS BRANCH.
  • release/vX.X.X - Named branches containing code specific to a release. Only make changes to this branch if you are fixing a bug on a release branch.

8.2.1.4 Milestones, Issues, Tasks

Milestones, issues, and tasks can be used to organize specific features or research projects. In general, there is a hierarchy:

  • milestones (Big picture, “Epic”): contains many issues, organized by release.
  • issues (Specific features / bugs, “Story”): may contain a list of tasks; represent
  • task list (to do list, “Tasks”): list of steps required to close an issue, e.g.:
* [ ] first do this
* [ ] then this
* [ ] completed when x and y

8.2.1.5 Quick and Easy

The easiest approach is to use GitHub’s browser-based workflow. This is useful when your change is a few lines, if you are editing a wiki, or if the edit is trivial (and won’t break the code). The GitHub documentation is here, but it is simple: find the page or file you want to edit and click “edit”; the GitHub web application will then automatically fork and branch, and allow you to submit a pull request. Note, however, that unless you are a member of the PEcAn project the “edit” button will not be active, and you’ll want to follow the workflow described below for forking and then submitting a pull request.

8.2.1.7 Before any work is done

The first step below only needs to be done once when you first start working on the PEcAn code. The steps below that need to be done to set up PEcAn on your computer, and would need to be repeated if you move to a new computer. If you are working from the PEcAn VM, you can skip the “git clone” since the PEcAn code is already installed.

Most people will not be able to work in the PEcAn repository directly and will need to create a fork of the PEcAn source code in their own GitHub space (GitHub help: “fork a repo”). This forked repository will allow you to create branches, commit changes back to GitHub, and create pull requests to the develop branch of PEcAn.

The forked repository is the only way for external people to commit code back to PEcAn and BETY. The pull request will start a review process that will eventually result in the code being merged into the main copy of the codebase. See https://help.github.com/articles/fork-a-repo for more information, especially on how to keep your fork up to date with respect to the original. (Rstudio users should also see Git + Rstudio, below)

You can set up SSH keys to make it easier to commit code back to GitHub. This is especially useful if you are working from a cluster; see set up ssh keys.

  1. Introduce yourself to Git

git config --global user.name "FULLNAME"
git config --global user.email you@yourdomain.example.com

  2. Fork PEcAn on GitHub. Go to the PEcAn source code and click on the Fork button in the upper right. This will create a copy of PEcAn in your personal space.

  3. Clone your fork to your local machine via the command line

git clone git@github.com:<username>/pecan.git

If this does not work, try the https method

git clone https://github.com/<username>/pecan.git

  4. Define upstream repository

cd pecan
git remote add upstream git@github.com:PecanProject/pecan.git

8.2.1.8 During development:

  • commit often;
  • each commit can address 0 or 1 issue; many commits can reference an issue
  • ensure that all tests are passing before anything is pushed into develop.

8.2.1.9 Basic Workflow

This workflow is for educational purposes only. Please use the Recommended Workflow if you plan on contributing to PEcAn. This workflow does not include creating branches, a feature we would like you to use.

  1. Get the latest code from the main repository

git pull upstream develop

  2. Do some coding

  3. Commit after each chunk of code (multiple times a day)

git commit -m "<some descriptive information about what was done; references/fixes gh-X>"

  4. Push to YOUR GitHub (when a feature is working, a set of bugs are fixed, or you need to share progress with others)

git push origin develop

  5. Before submitting code back to the main repository, make sure that code compiles from the main directory.

make

  6. Submit a pull request with a reference to the related issue.

8.2.1.11 After pull request is merged

  1. Make sure you start in develop

git checkout develop

  2. Delete the branch remotely

git push origin --delete <branchname>

  3. Delete the branch locally

git branch -D <branchname>

8.2.1.12 Fixing a release Branch

If you would like to make changes to a release branch, you must follow a different workflow, as the release branch will not contain the latest code on develop and must remain separate.

  1. Fetch upstream remote branches

git fetch upstream

  2. Checkout the correct release branch

git checkout -b release/vX.Y.Z upstream/release/vX.Y.Z

  3. Compile Code with make

make

  4. Make changes and commit them

git add <changed_file.R>
git commit -m "Describe changes"

  5. Compile and make roxygen changes

make
make document

  6. Commit and push any files that were changed by make document

  7. Make a pull request. It is essential that you compare your pull request to the remote release branch, NOT the develop branch.

8.2.1.14 Other Useful Git Commands:

  • GIT encourages branching “early and often”
  • First pull from develop
  • Branch before working on feature
  • One branch per feature
  • You can switch easily between branches
  • Merge feature into main line when branch done

If during above process you want to work on something else, commit all your code, create a new branch, and work on new branch.

  • Delete a branch: git branch -d <name of branch>
  • To push a branch: git push -u origin <name of branch>
  • To check out a branch:
git fetch origin
git checkout --track origin/<name of branch>
  • Show graph of commits:

git log --graph --oneline --all

8.2.1.15 Tags

Git supports two types of tags: lightweight and annotated. For more information see the Tagging Chapter in the Git documentation.

Lightweight tags are useful, but here we discuss the annotated tags that are used for marking stable versions, major releases, and versions associated with published results.

The basic command is git tag. The -a flag means ‘annotated’ and -m is used before a message. Here is an example:

git tag -a v0.6 -m "stable version with foo and bar features, used in the foobar publication by Bob"

Adding a tag to a remote repository must be done explicitly with a push, e.g.

git push origin v0.6

To use a tagged version, just checkout:

git checkout v0.6

To tag an earlier commit, just append the commit SHA to the command, e.g.

git tag -a v0.99 -m "last version before 1.0" 9fceb02

Using GitHub

The easiest way to get working with GitHub is by installing the GitHub client. For instructions for your specific OS and download of the GitHub client, see https://help.github.com/articles/set-up-git. This will help you set up an SSH key to push code back to GitHub. To check out a project you do not need to have an ssh key and you can use the https or git url to check out the code.

8.2.1.16 Git + Rstudio

Rstudio is nicely integrated with many development tools, including git and GitHub. It is quite easy to check out source code from within the Rstudio program or browser. The Rstudio documentation includes useful overviews of version control and R package development.

Once you have git installed on your computer (see the Rstudio version control documentation for instructions), you can use the following steps to install the PEcAn source code in Rstudio.

8.2.1.17 Creating a Read-only version:

This is a fast way to clone the repository that does not support contributing new changes (this can be done with further modification).

  1. install Rstudio (www.rstudio.com)
  2. click (upper right) project

8.2.1.18 For development:

  1. create account on github
  2. create a fork of the PEcAn repository to your own account https://www.github.com/pecanproject/pecan
  3. install Rstudio (www.rstudio.com)
  4. generate an ssh key
  • in Rstudio:
    • Tools -> Options -> Git/SVN -> "create RSA key"
    • View public key -> ctrl+C to copy
  • in GitHub
    • go to ssh settings
    • -> 'add ssh key' -> ctrl+V to paste -> 'add key'
  5. Create project in Rstudio
  • project (upper right) -> create project -> version control -> Git - clone a project from a Git Repository
  • paste repository url git@github.com:<username>/pecan.git
  • choose working dir. for repository

8.2.1.19 References

8.2.1.20 Git Documentation

8.2.1.21 GitHub Documentation

When in doubt, the first step is to click the “Help” button at the top of the page.

8.2.2 GitHub use with PEcAn

In this section, development topics are introduced and discussed. PEcAn code lives within the PEcAn repository on GitHub. If you are looking for an issue to work on, take a look through issues labeled “good first issue”. To get started you will want to review

We use GitHub to track development.

To learn about GitHub, it is worth taking some time to read through the FAQ. When in doubt, the first step is to click the “Help” button at the top of the page.

  • To address specific people, use a github feature called @mentions e.g. write @dlebauer, @robkooper, @mdietze, or @serbinsh … in the issue to alert the user as described in the GitHub documentation on notifications

8.2.2.1 Bugs, Issues, Features, etc.

8.2.2.2 Reporting a bug

  1. (For developers) work through debugging.
  2. Once you have identified a problem that you cannot resolve, you can write a bug report
  3. Write a bug report
  4. submit the bug report
  5. If you do find the answer, explain the resolution (in the issue) and close the issue

8.2.2.3 Required content

Note:

  • a bug is only a bug if it is reproducible
  • clear bug reports save time
  1. Clear, specific title
  2. Description -
  • What you did
  • What you expected to happen
  • What actually happened
  • What does work, under what conditions does it fail?
  • Reproduction steps - minimum steps required to reproduce the bug
  3. Additional materials that could help identify the cause:
  • screen shots
  • stack traces, logs, scripts, output
  • specific code and data / settings / configuration files required to reproduce the bug
  • environment (operating system, browser, hardware)

8.2.2.4 Requesting a feature

(from The Pragmatic Programmer, available as ebook through UI libraries, hardcopy on David’s bookshelf)

  • focus on “user stories”, e.g. specific use cases
  • Be as specific as possible,

  • Here is an example:

  1. Bob is at www.mysite.edu/maps
  2. map of the region (based on user location, e.g. US, Asia, etc)
  3. option to “use current location” is provided, if clicked, map zooms in to, e.g. state or county level
  4. for site run:
    1. option to select existing site or specify point by lat/lon
    2. option to specify a bounding box and grid resolution in either lat/lon or polar stereographic.
  5. asked to specify start and end times in terms of year, month, day, hour, minute. Time is recorded in UTC not local time, this should be indicated.

8.2.2.5 Closing an issue

  1. Definition of “Done”
  • test
  • documentation
  2. when issue is resolved:
  • status is changed to “resolved”
  • assignee is changed to original author
  3. if original author agrees that issue has been resolved
  • original author changes status to “closed”
  4. except for trivial issues, issues are only closed by the author

8.2.2.6 When to submit an issue?

Ideally, non-trivial code changes will be linked to an issue and a commit.

This requires creating issues for each task, making small commits, and referencing the issue within your commit message. Issues can be created on GitHub. These issues can be linked to commits by adding text such as fixes gh-5 to the commit message.

Rationale: This workflow is a small upfront investment that reduces error and time spent re-creating and debugging errors. Associating issues and commits makes it easier to identify why a change was made, and potential bugs that could arise when the code is changed. In addition, knowing which issue you are working on clarifies the scope and objectives of your current task.

8.3 Coding Practices

8.3.1 Coding Style

Consistent coding style improves readability and reduces errors in shared code.

R does not have an official style guide, but Hadley Wickham provides one that is well thought out and widely adopted. Advanced R: Coding Style.

Both the Wickham text and this page are derived from Google’s R Style Guide.

8.3.1.1 Use Roxygen2 documentation

This is the standard method of documentation used in PEcAn development; it provides inline documentation similar to doxygen. Even trivial functions should be documented.

See Roxygen2.

8.3.1.2 Write your name at the top

Any function that you create or make a meaningful contribution to should have your name listed after the author tag in the function documentation.

8.3.1.3 Use testthat testing package

See Unit_Testing for instructions, and Advanced R: Tests.

  • tests provide support for documentation - they define what a function is (and is not) expected to do
  • all functions need tests to ensure basic functionality is maintained during development.
  • all bugs should have a test that reproduces the bug, and the test should pass before bug is closed

8.3.1.4 Don’t use shortcuts

R provides many shortcuts that are useful when coding interactively, or for writing scripts. However, these can make code more difficult to read and can cause problems when written into packages.

8.3.1.5 Function Names (verb.noun)

Following the convention established in PEcAn 0.1, function names are all lowercase with periods separating words. They should generally have a verb.noun format, such as query.traits, get.samples, etc.

8.3.1.6 File Names

File names should end in .R, .Rdata, or .rds (as appropriate) and should be meaningful, e.g. named after the primary functions that they contain. There should be a separate file for each major high-level function to aid in identifying the contents of files in a directory.

8.3.1.7 Use “<-” as an assignment operator

Because most R code uses <- (except where = is required), we will use <- for assignment; = is reserved for function arguments.

8.3.1.8 Use Spaces

  • around all binary operators (=, +, -, <-, etc.).
  • after but not before a comma

8.3.1.9 Use curly braces

The option to omit curly braces is another shortcut that makes code easier to write but harder to read and more prone to error.
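The three conventions above (assignment with <-, spaces around operators, and explicit curly braces) can be seen together in a short sketch. The function clamp.value is hypothetical and exists only to illustrate the style; it is not part of PEcAn.

```r
# `<-` for assignment, `=` only for arguments, spaces around binary
# operators and after commas, and braces around every branch.
clamp.value <- function(x, lower = 0, upper = 1) {
  if (x < lower) {
    x <- lower
  } else if (x > upper) {
    x <- upper
  }
  return(x)
}

# Values outside [lower, upper] are pulled back to the nearest bound.
print(clamp.value(-1))
print(clamp.value(0.5))
print(clamp.value(2))
```

Writing the braces even for one-line branches means a later edit that adds a second statement to a branch cannot silently fall outside the condition.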

8.3.1.10 Package Dependencies

In the source code for PEcAn functions, all functions that are not from base R or the current package must be called with explicit namespacing; i.e. package::function (e.g. ncdf4::nc_open(...), dplyr::select(), PEcAn.logger::logger.warn()). This is intended to maximize clarity for current and future developers (including yourself), and to make it easier to quickly identify (and possibly remove) external dependencies.

In addition, it may be a good idea to call some base R functions with known, common namespace conflicts this way as well. For instance, if you want to use base R’s filter function, it’s a good idea to write it as stats::filter to avoid unintentional conflicts with dplyr::filter.

The one exception to this rule is infix operators (e.g. magrittr::"%>%") which cannot be conveniently namespaced. These functions should be imported using the Roxygen @importFrom tag. For example:
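A minimal sketch of such an import is shown here. The function get.smallest is hypothetical and only illustrates where the tag goes; the explicit binding on the first code line is an assumption added so the snippet can be tried outside a package, where no NAMESPACE file exists.

```r
# In a package, the @importFrom tag below is what makes %>% available.
# This binding is only needed to try the snippet interactively
# (magrittr must be installed).
`%>%` <- magrittr::`%>%`

##' Return the smallest values of a vector
##'
##' Hypothetical function used only to illustrate the import.
##' @param x numeric vector
##' @param n number of values to return
##' @return numeric vector holding the n smallest values of x
##' @importFrom magrittr %>%
##' @export
get.smallest <- function(x, n = 3) {
  x %>% sort() %>% utils::head(n)
}

print(get.smallest(c(5, 1, 4, 2, 3)))
```

Note that only the operator is imported; utils::head is still called with explicit namespacing.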

Never use library or require inside package functions.

Any package dependencies added in this way should be added to the Imports: list in the package DESCRIPTION file. Do not use Depends: unless you have a very good reason. The Imports list should be sorted alphabetically, with each package on its own line. It is also a good idea to include version requirements in the Imports list (e.g. dplyr (>=0.7)).

External packages that do not provide essential functionality can be relegated to Suggests instead of Imports. In particular, consider this for packages that are large, difficult to install, and/or bring in a large number of their own dependencies. Functions using these kinds of dependencies should check for their availability with requireNamespace and fail informatively in their absence. For example:
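One way to write such a check is sketched below. Both check.suggested and read.optional.nc are hypothetical names, not PEcAn functions; ncdf4 stands in for any package that has been moved to Suggests.

```r
##' Stop with an informative message if a suggested package is missing
##'
##' Hypothetical helper, not part of PEcAn.
##' @param pkg name of a package listed under Suggests
##' @return TRUE, invisibly, if the package is available
check.suggested <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is needed for this function to work. ",
         "Please install it.", call. = FALSE)
  }
  invisible(TRUE)
}

# A function relying on a suggested package checks first, then uses
# explicit namespacing as usual.
read.optional.nc <- function(path) {
  check.suggested("ncdf4")
  ncdf4::nc_open(path)
}
```

Because requireNamespace loads the package without attaching it, the check does not change the search path of whoever calls the function.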

8.3.2 Logging

During development we often add many print statements to check to see how the code is doing, what is happening, what intermediate results there are etc. When done with the development it would be nice to turn this additional code off, but have the ability to quickly turn it back on if we discover a problem. This is where logging comes into play. Logging allows us to use “rules” to say what information should be shown. For example when I am working on the code to create graphs, I do not have to see any debugging information about the SQL command being sent, however trying to figure out what goes wrong during a SQL statement it would be nice to show the SQL statements without adding any additional code.

8.3.2.1 PEcAn logging functions

The logger family of functions is more sophisticated, and can be used in place of stop, warning, print, and similar functions. The logger functions make it easier to print to a system log file.

  • The file test.logger.R provides descriptive examples
  • This query provides a current overview of functions that use logging
  • logger functions (in order of increasing level):
    • logger.debug
    • logger.info
    • logger.warn
    • logger.error
  • the logger.setLevel function sets the level at which a message will be printed
    • logger.setLevel("DEBUG") will print messages from all logger functions
    • logger.setLevel("ERROR") will only print messages from logger.error
    • logger.setLevel("INFO") and logger.setLevel("WARN") shows messages from logger.<level> and higher functions, e.g. logger.setLevel("WARN") shows messages from logger.warn and logger.error
    • logger.setLevel("OFF") suppresses all logger messages
  • To print all messages to console, use logger.setUseConsole(TRUE)
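The threshold rule described in the list above can be illustrated with a few lines of base R. This is only a sketch of the rule, not the PEcAn.logger implementation: the names log.levels and is.printed and the numeric values are assumptions chosen so that DEBUG < INFO < WARN < ERROR < OFF.

```r
# Illustrative level ordering; the numbers themselves carry no meaning
# beyond their order.
log.levels <- c(DEBUG = 10, INFO = 20, WARN = 30, ERROR = 40, OFF = 50)

# TRUE when a message of `message.level` passes the `threshold` that
# would be given to logger.setLevel().
is.printed <- function(message.level, threshold) {
  log.levels[[message.level]] >= log.levels[[threshold]]
}

# Which of the four logger functions would print at each threshold?
message.levels <- c("DEBUG", "INFO", "WARN", "ERROR")
for (threshold in names(log.levels)) {
  keep <- vapply(message.levels, is.printed, logical(1), threshold = threshold)
  cat(threshold, ":", paste(message.levels[keep], collapse = ", "), "\n")
}
```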

8.3.2.2 Other R logging packages

  • This section is for reference - these functions should not be used in PEcAn, as they are redundant with the logger.* functions described above

R does provide a basic logging capability using stop, warning and message. These allow you to print a message (and, in the case of stop, to halt execution). However, there is no easy method to redirect the logging information to a file, or to turn the logging information on and off. This is where one of the following packages comes into play. The packages themselves are very similar since they try to emulate log4j.

Both of the following packages use hierarchical loggers, meaning that if you change the displayed level of logging at one level, all levels below it will update their logging.

8.3.2.2.1 logging

The logging development is done at http://logging.r-forge.r-project.org/ and more information is located at http://cran.r-project.org/web/packages/logging/index.html . To install use the following command:

This one is my preference, purely based on documentation.

8.3.3 Package Data

8.3.3.1 Summary:

Files with the following extensions will be read by R as data:

  • plain R code in .R and .r files is sourced using source()
  • text tables in .tab, .txt, .csv files are read using read.table()
  • objects in R image files: .RData, .rda are loaded using load()
    • capitalization matters
    • all objects in foo.RData are loaded into environment
    • pro: easiest way to store objects in R format
    • con: format is application (R) specific

Details are in ?data, which is mostly a copy of Data section of Writing R Extensions.

8.3.3.2 Accessing data

Data in the [data] directory will be accessed in the following ways,

  • efficient way: (especially for large data sets) using the data function:
  • easy way: by adding the following line to the package DESCRIPTION: note: this should be used with caution or it can cause difficulty as discussed in redmine issue #1118

From the R help page:

Currently, a limited number of data formats can be accessed using the data function by placing one of the following filetypes in a package’s data directory:

  • files ending .R or .r are source()d in, with the R working directory changed temporarily to the directory containing the respective file. (data ensures that the utils package is attached, in case it had been run via utils::data.)
  • files ending .RData or .rda are load()ed.
  • files ending .tab, .txt or .TXT are read using read.table(..., header = TRUE), and hence result in a data frame.
  • files ending .csv or .CSV are read using read.table(..., header = TRUE, sep = ';'), and also result in a data frame.

If your data does not fall in those 4 categories, you can use the system.file function to get access to the data:

The arguments are folder, filename(s) and then package. It will return the fully qualified path name to a file in a package; in this case it points to the trait data. This is almost the same as the data function; however, we can now use any function to read the file, such as read.csv instead of read.csv2 which seems to be the default of data. This also allows us to store arbitrary files in the data folder, such as the bug file, and load it when we need it.
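Both access routes can be tried with files that ship with base R. The sketch below uses the women data set from the datasets package and the DESCRIPTION file of utils as stand-ins, since the PEcAn file names are not given here; the PEcAn-style call in the comment uses hypothetical file and package names.

```r
# data(): load a packaged data set into a chosen environment.
data.env <- new.env()
utils::data("women", package = "datasets", envir = data.env)
str(data.env$women)

# system.file(): get the full path of any file installed with a package,
# then read it with whatever function suits the format.
# A PEcAn call would have the form (names hypothetical):
#   system.file("data", "some.file.csv", package = "PEcAn.somepackage")
desc.path <- system.file("DESCRIPTION", package = "utils")
pkg.info <- read.dcf(desc.path, fields = c("Package", "Version"))
print(pkg.info)
```

system.file returns an empty string when the file is not found, so it is worth checking the result before passing it to a reader.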

8.3.3.2.1 Examples of data in PEcAn packages
  • outputs: [/modules/uncertainties/data/output.RData]
  • parameter samples [/modules/uncertainties/data/samples.RData]

8.3.4 Roxygen2

This is the standard method of documentation used in PEcAn development; it provides inline documentation similar to doxygen.

8.3.4.1 Canonical references:

8.3.4.2 Basic Roxygen2 instructions:

Section headers link to “Writing R extensions” which provides in-depth documentation. This is provided as an overview and quick reference.

8.3.4.3 Tags

  • tags are preceded by ##'
  • tags required by R:
    • title tag is required, along with actual title
    • param one for each parameter, should be defined
    • return must state what function returns (or nothing, if something occurs as a side effect)
  • tags strongly suggested for most functions:
    • author
    • examples can be similar to test cases.
  • optional tags:
    • export required if function is used by another package
    • import can import a required function from another package (if package is not loaded or other function is not exported)
    • seealso suggests related functions. These can be linked using \code{\link{}}

8.3.4.4 Text markup

8.3.4.4.1 Formatting
  • \bold{}
  • \emph{} italics
8.3.4.4.3 Math
  • \eqn{a+b=c} uses LaTeX to format an inline equation
  • \deqn{a+b=c} uses LaTeX to format a displayed equation
  • \deqn{latex}{ascii} and \eqn{latex}{ascii} can be used to provide different versions in LaTeX and ASCII.
8.3.4.4.4 Lists
\enumerate{
\item A database consists of one or more records, each with one or
more named fields.
\item Regular lines start with a non-whitespace character.
\item Records are separated by one or more empty lines.
}
\itemize and \enumerate commands may be nested.
8.3.4.4.5 Tables (http://cran.r-project.org/doc/manuals/R-exts.html#Lists-and-tables)
\tabular{rlll}{
[,1] \tab Ozone \tab numeric \tab Ozone (ppb)\cr
[,2] \tab Solar.R \tab numeric \tab Solar R (lang)\cr
[,3] \tab Wind \tab numeric \tab Wind (mph)\cr
[,4] \tab Temp \tab numeric \tab Temperature (degrees F)\cr
[,5] \tab Month \tab numeric \tab Month (1--12)\cr
[,6] \tab Day \tab numeric \tab Day of month (1--31)
}

8.3.4.5 Example

Here is an example documented function, myfun

##' My function adds three numbers
##'
##' A great function for demonstrating Roxygen documentation
##' @param a numeric
##' @param b numeric
##' @param c numeric
##' @return d, numeric sum of a + b + c
##' @export
##' @author David LeBauer
##' @examples
##' myfun(1,2,3)
##' \dontrun{myfun(NULL)}
myfun <- function(a, b, c){
  d <- a + b + c
  return(d)
}

In emacs, with the cursor inside the function, the keybinding C-x O will generate an outline or update the Roxygen2 documentation.

8.3.4.6 Updating documentation

  • After adding documentation, run the following command (replacing common with the name of the folder you want to update), in R, using devtools to call roxygenize:
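A minimal sketch of that call, assuming devtools is installed and the working directory is the PEcAn root. The wrapper document.package is a hypothetical convenience that adds two sanity checks; the underlying call is devtools::document.

```r
##' Regenerate Roxygen2 documentation for one package directory
##'
##' Hypothetical wrapper, not part of PEcAn.
##' @param pkg.dir path to the package folder, e.g. "common"
##' @return the result of devtools::document, invisibly
document.package <- function(pkg.dir) {
  if (!dir.exists(pkg.dir)) {
    stop("Package directory not found: ", pkg.dir)
  }
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("Package 'devtools' is required to update documentation")
  }
  invisible(devtools::document(pkg.dir))
}

# Usage, run from the PEcAn root directory:
# document.package("common")
```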

8.3.5 Testing

PEcAn uses the testthat package developed by Hadley Wickham. Hadley has written instructions for using this package in his Testing chapter.

8.3.5.1 Rationale

8.3.5.2 Tests make development easier and less error prone

Testing makes it easier to develop by organizing everything you are already doing anyway - but integrating it into the testing and documentation. With a codebase like PEcAn, it is often difficult to get started. You have to figure out

  • what was I doing yesterday?
  • what do I want to do today?
  • what existing functions do I need to edit?
  • what are the arguments to these functions (and what are examples of valid arguments)
  • what packages are affected
  • where is a logical place to put files used in testing

8.3.5.3 Quick Start:

  • decide what you want to do today
  • identify the issue in github (if none exists, create one)
  • to work on issue 99, create a new branch called “github99” or some descriptive name… Today we will enable an existing function, make.cheas to make goat.cheddar. We will know that we are done by the color and taste.

    git branch goat-cheddar
    git checkout goat-cheddar
  • open existing (or create new) file in inst/tests/. If working on code in “myfunction” or a set of functions in “R/myfile.R”, the file should be named accordingly, e.g. “inst/tests/test.myfile.R”
  • if you are lucky, the function has already been tested and has some examples.
  • if not, you may need to create a minimal example, often requiring a settings file. The default settings file can be obtained in this way:

  • write what you want to do

    test_that("make.cheas can make cheese",{
      goat.cheddar <- make.cheas(source = 'goat', style = 'cheddar')
      expect_equal(color(goat.cheddar), "orange")
      expect_is(object = goat.cheddar, class = "cheese")
      expect_true(all(c("sharp", "creamy") %in% taste(goat.cheddar)))
    })
  • now edit the make.cheas function until it makes savory, creamy, orange cheese.
  • commit often
  • update documentation and test

  • commit again
  • when complete, merge, and push

8.3.5.4 Test files

Many of PEcAn’s functions require inputs that are provided as data. These can be in the /data or the /inst/extdata folders of a package. Data that are not package specific should be placed in the PEcAn.all or PEcAn.utils packages.

Some useful conventions:

8.3.5.5 Settings

  • A generic settings file can be found in the PEcAn.all package
  • database settings can be specified, and tests run only if a connection is available

We currently use the following database to run tests against; tests that require access to a database should check db.exists() and be skipped if it returns FALSE to avoid failed tests on systems that do not have the database installed.

  • instructions for installing this are available on the VM creation wiki
  • examples can be found in the PEcAn.DB package (base/db/tests/testthat/).

  • Model specific settings can go in the model-specific module, for example:

  • test-specific settings:
    • settings text can be specified inline:

      settings.text <- "
        <pecan>
          <nocheck>nope</nocheck> ## allows bypass of checks in the read.settings functions
          <pfts>
            <pft>
              <name>ebifarm.pavi</name>
              <outdir>test/</outdir>
            </pft>
          </pfts>
          <outdir>test/</outdir>
          <database>
            <userid>bety</userid>
            <passwd>bety</passwd>
            <location>localhost</location>
            <name>bety</name>
          </database>
        </pecan>"
      settings <- read.settings(settings.text)
    • values in settings can be updated:

8.3.5.6 Helper functions created to make testing easier

  • tryl returns FALSE if function gives error
  • temp.settings creates temporary settings file
  • test.remote returns TRUE if remote connection is available
  • db.exists returns TRUE if connection to database is available

8.3.5.7 When should I test?

A test should be written for each of the following situations:

  1. Each bug should get a regression test.
    • The first step in handling a bug is to write code that reproduces the error.
    • This code becomes the test.
    • This is most important when the error could reappear.
    • It is essential when the error silently produces invalid results.
  2. Every time a (non-trivial) function is created or edited:
    • Write tests that indicate how the function should perform.
      • example: expect_equal(sum(1,1), 2) indicates that the sum function should take the sum of its arguments
    • Write tests for cases under which the function should throw an error.
      • example: expect_error(sum("foo"))
      • better : expect_error(sum("foo"), "invalid 'type' (character)", fixed = TRUE), which also checks the error message; fixed = TRUE matches the text literally, so the parentheses are not read as a regular-expression group

8.3.5.8 What types of testing are important to understand?

8.3.5.9 Unit Testing / Test Driven Development

A test suite is only as good as the tests in it. Write the test first, then the code:

  1. write test
  2. write code

8.3.5.10 Regression Testing

When a bug is found,

  1. write a test that finds the bug (the minimum test required to make the test fail)
  2. fix the bug
  3. bug is fixed when test passes
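
The steps above can be sketched with a small, invented example. The function mean_height and the bug it describes are hypothetical and are not taken from PEcAn.

```r
library(testthat)

# Hypothetical function with a reported bug: the first version called
# mean(x) and returned NA whenever the input contained a missing value.
mean_height <- function(x) {
  mean(x, na.rm = TRUE)  # the fix: ignore missing values
}

# Regression test: the smallest input that reproduced the bug report.
# It fails against the buggy version and passes once the fix is in place.
test_that("mean_height ignores missing values", {
  expect_equal(mean_height(c(1, NA, 3)), 2)
})
```

Keeping this test in the package means the same bug cannot return unnoticed.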

8.3.5.11 How should I test in R? The testthat package.

Tests are found in ~/pecan/<packagename>/tests/testthat/, for example base/utils/tests/testthat/

See http://r-pkgs.had.co.nz/tests.html for details on how to use the testthat package.

8.3.5.11.1 List of Expectations

Full                                   Abbreviation
expect_that(x, is_true())              expect_true(x)
expect_that(x, is_false())             expect_false(x)
expect_that(x, is_a(y))                expect_is(x, y)
expect_that(x, equals(y))              expect_equal(x, y)
expect_that(x, is_equivalent_to(y))    expect_equivalent(x, y)
expect_that(x, is_identical_to(y))     expect_identical(x, y)
expect_that(x, matches(y))             expect_match(x, y)
expect_that(x, prints_text(y))         expect_output(x, y)
expect_that(x, shows_message(y))       expect_message(x, y)
expect_that(x, gives_warning(y))       expect_warning(x, y)
expect_that(x, throws_error(y))        expect_error(x, y)

8.3.5.11.2 How to run tests

Add the following to “pecan/tests/testthat.R”:

8.3.5.12 Basic use of the testthat package

Here is an example of tests (these should be placed in <packagename>/tests/testthat/test-<sourcefilename>.R):
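
As an illustration only, a small test file might look like the following. The function celsius_to_kelvin is hypothetical and is defined inline so that the example is self-contained; in a package it would live in R/<sourcefilename>.R and only the test_that blocks would go in the test file.

```r
library(testthat)

# Hypothetical function under test
celsius_to_kelvin <- function(x) {
  if (!is.numeric(x)) {
    stop("x must be numeric")
  }
  x + 273.15
}

# One test_that() block per behavior, with a descriptive name
test_that("celsius_to_kelvin converts values", {
  expect_equal(celsius_to_kelvin(0), 273.15)
  expect_equal(celsius_to_kelvin(c(-273.15, 100)), c(0, 373.15))
})

# Check the error message as well as the fact that an error occurs
test_that("celsius_to_kelvin rejects non-numeric input", {
  expect_error(celsius_to_kelvin("warm"), "must be numeric")
})
```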

8.3.5.12.2 Function testing

This section walks through testing a new function, as.sequence. The function and its documentation are in R/utils.R and the tests are in tests/test.utils.R.

Recently, I made the function as.sequence to turn any vector into a sequence, with custom handling of NA’s:

The next step was to add documentation and tests. Many people find it more efficient to write tests before writing the function. This is true, but it also requires more discipline. I wrote these tests to handle the variety of cases that I had observed.

As currently used, the function is exposed to a fairly restricted set of options - results of downloads from the database and transformations.
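
As a rough sketch only, and not necessarily the implementation in R/utils.R, a function with the behavior described (numbering values by order of first appearance, with custom handling of NAs) and a few tests for it might look like this. The na.rm argument and the choice to give NAs their own index are assumptions made for illustration.

```r
library(testthat)

# Sketch: turn any vector into a sequence of integer indices
as.sequence <- function(x, na.rm = TRUE) {
  # Number each distinct value by its order of first appearance;
  # factor() drops NA from the levels, so NAs stay NA here
  x2 <- as.integer(factor(x, levels = unique(x)))
  if (all(is.na(x2))) {
    # Nothing but NA: treat every element as the same group
    return(rep(1L, length(x2)))
  }
  if (na.rm) {
    # Assumption: NAs share one new index after the largest one used
    x2[is.na(x2)] <- max(x2, na.rm = TRUE) + 1L
  }
  x2
}

test_that("as.sequence numbers values by first appearance", {
  expect_equal(as.sequence(c("b", "a", "b")), c(1, 2, 1))
})

test_that("as.sequence handles NAs", {
  expect_equal(as.sequence(c(10, NA, 10, 20)), c(1, 3, 1, 2))
  expect_equal(as.sequence(c(10, NA, 10, 20), na.rm = FALSE), c(1, NA, 1, 2))
  expect_equal(as.sequence(c(NA, NA)), c(1, 1))
})
```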

8.3.5.13 Testing the Shiny Server

Shiny can be difficult to debug because, when run as a web service, the R output is hidden in system log files that are hard to find and read. One useful approach to debugging is to use port forwarding, as follows.

First, on the remote machine (including the VM), make sure R’s working directory is set to the directory of the Shiny app (e.g., setwd("/path/to/pecan/shiny/WorkflowPlots"), or just open the app as an RStudio project). Then, in the R console, run the app as:

shiny::runApp(port = XXXX)
# E.g. shiny::runApp(port = 5638)

Then, on your local machine, open a terminal and run the following command, matching XXXX to the port above and YYYY to any unused port on your local machine (any unused port number from 1024 to 65535 should work; lower ports usually require root privileges).

ssh -L YYYY:localhost:XXXX <remote connection>
# E.g., for the PEcAn VM, given the above port:
# ssh -L 5639:localhost:5638 carya@localhost -p 6422

Now, in a web browser on your local machine, browse to localhost:YYYY (e.g., localhost:5639) to run whatever app you started with shiny::runApp in the previous step. All of the output should display in the R console where the shiny::runApp command was executed. Note that this includes any print, message, logger.*, etc. statements in your Shiny app.

If the Shiny app hits an R error, the backtrace should include a line like Hit error at of server.R#LXX, where XX is a line number that you can use to track down the error. To return from the error to a normal R prompt, hit <Control>-C (or the “Stop” button in RStudio). To restart the app, run shiny::runApp(port = XXXX) again (keeping the same port).

Note that Shiny runs any code in the pecan/shiny/<app> directory at the moment the app is launched. So, any changes you make to the code in server.R and ui.R or scripts loaded therein will take effect the next time the app is started.

If for whatever reason this doesn’t work with RStudio, you can always run R from the command line. Also, note that the ability to forward ports (ssh -L) may depend on the ssh configuration of your remote machine. These instructions have been tested on the PEcAn VM (v.1.5.2+).

8.3.6 devtools package

Provides functions to simplify development

Documentation: The R devtools package

Other tips for devtools (from the documentation):

  • Adding the following to your ~/.Rprofile will load devtools when running R in interactive mode:
  • Adding the following to your .Rpackages will allow devtools to recognize a package by folder name, rather than by directory path:

Now, devtools can take pkg as an argument instead of /path/to/pkg/, so you can use, e.g., build("pkg") instead of build("/path/to/pkg/")

8.4 Download and Compile PEcAn

Set R_LIBS_USER

CRAN Reference

8.4.1 Download, compile and install PEcAn from GitHub

For more information on the capabilities of the PEcAn Makefile, check out our section on Updating PEcAn.

The following runs a small script that sets up hooks to prevent people from using the pecan demo user account to check in any code.

8.4.2 PEcAn Testrun

Do the run. This assumes you have installed the BETY database, the sites tar file, and SIPNET.

NB: pecan.xml is configured for the virtual machine; you will need to change the field from ‘/home/carya/’ to wherever you installed your ‘sites’, usually $HOME

8.5 Directory structure

8.5.1 Overview of PEcAn repository as of PEcAn 1.5.3

pecan/
 +- base/          # Core functions
    +- all         # Dummy package to load all PEcAn R packages
    +- db          # Modules for querying the database
    +- logger      # Report warnings without killing workflows
    +- qaqc        # Model skill testing and integration testing
    +- remote      # Communicate with and execute models on local and remote hosts
    +- settings    # Functions to read and manipulate PEcAn settings files
    +- utils       # Misc. utility functions
    +- visualization # Advanced PEcAn visualization module
    +- workflow    # functions to coordinate analysis steps
 +- book_source/   # Main documentation and developer's guide
 +- CHANGELOG.md   # Running log of changes in each version of PEcAn
 +- docker/        # Experimental replacement for PEcAn virtual machine
 +- documentation  # index_vm.html, references, other misc.
 +- models/        # Wrappers to run models within PEcAn
    +- ed/         # Wrapper scripts for running ED within PEcAn
    +- sipnet/     # Wrapper scripts for running SIPNET within PEcAn
    +- ...         # Wrapper scripts for running [...] within PEcAn
    +- template/   # Sample wrappers to copy and modify when adding a new model
 +- modules        # Core modules
    +- allometry
    +- data.atmosphere
    +- data.hydrology
    +- data.land
    +- meta.analysis
    +- priors
    +- rtm
    +- uncertainty
    +- ...
 +- scripts        # R and Shell scripts for use with PEcAn
 +- shiny/         # Interactive visualization of model results
 +- tests/         # Settings files for host-specific integration tests
 +- web            # Main PEcAn website files

8.5.2 Generic R package structure:

See the R development wiki for more information on writing code and adding data.

 +- DESCRIPTION    # short description of the PEcAn library
 +- R/             # location of R source code
 +- man/           # Documentation (automatically compiled by Roxygen)
 +- inst/          # files to be installed with package that aren't R functions
    +- extdata/    # misc. data files (in misc. formats)
 +- data/          # data used in testing and examples (saved as *.RData or *.rda files)
 +- NAMESPACE      # declaration of package imports and exports (automatically compiled by Roxygen)
 +- tests/         # PEcAn testing scripts
   +- testthat/    # nearly all tests should use the testthat framework and live here