34 Dockerfiles for Models

Each model will be distributed as a Docker container. These containers will contain the model, as well as additional code to convert the model output to the standard PEcAn output, and code to connect to the message bus and receive messages.

Most of these Dockerfiles will be the same: they compile the model in one container, copy the resulting binary and any additional files needed to a second container, and add all the PEcAn pieces needed. This process reduces the size of the final image.

Each of these Docker images will be ready to run the model given an input file and produce outputs that conform to PEcAn. Each container will contain a small script that will execute the binary, check the model exit code, convert the model output to PEcAn output, and quit. The conversion from the model-specific output to the PEcAn output is done by calling model2netcdf. By default this code assumes the site is located at lat=0, lon=0. To set the actual location, start the docker process with -e "SITE_LAT=35.9782" -e "SITE_LON=-79.0942". The following variables can be set:

  • SITE_LAT : Latitude of the site, written in output files, default is 0
  • SITE_LON : Longitude of the site, written in output files, default is 0
  • START_DATE : Start date of the model to be executed.
  • END_DATE : End date of the model to be executed.
  • DELETE_RAW : Should the model output be deleted, default is no.
  • OVERWRITE : Should any results be overwritten
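
As a hedged illustration of passing these variables, the sketch below only assembles and prints a docker run command rather than executing it. The function name pecan_run_cmd is hypothetical and not part of PEcAn; the image name and coordinates are the ones used elsewhere in this chapter. The other variables in the list can be passed with additional -e options in the same way.

```shell
# Print (not run) a docker command that sets the site coordinates.
# pecan_run_cmd is a hypothetical helper name, not part of PEcAn.
pecan_run_cmd() {
    # $1 = host directory mounted at /work, $2 = image, $3 = latitude, $4 = longitude
    printf 'docker run --rm -v %s:/work -e SITE_LAT=%s -e SITE_LON=%s %s\n' \
        "$1" "$3" "$4" "$2"
}

pecan_run_cmd "$PWD/sipnet.data" pecan/pecan-sipnet:136 35.9782 -79.0942
```

The printed command can be inspected and then pasted into a terminal. Paths containing spaces would need quoting before doing so.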

The following environment variables are for information only and should not be changed:

  • MODEL : The name of the model, used in the script to run the model
  • BINARY : Location of the actual binary, used in the script to run the model
  • OUTDIR : Location where the data is, in this case /work
  • PECAN_VERSION : Version of PEcAn used to compile, default is develop
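
The small run script described earlier (execute the binary, check the exit code, convert the output, quit) might look roughly like the following. This is a sketch under stated assumptions, not the script shipped in the images: the log file names and the DONE and ERROR marker files are inferred from the cleanup command in the SIPNET example, and convert_output is a hypothetical placeholder for the model2netcdf conversion step.

```shell
# Hypothetical stand-in for the model2netcdf conversion; a real image would call PEcAn here.
convert_output() {
    echo "converting output in $1"
}

# Run the binary inside the output directory, record the outcome,
# and return the model's exit code on failure.
run_model() {
    # $1 = model binary (BINARY, assumed to be an absolute path), $2 = output directory (OUTDIR)
    if (cd "$2" && "$1" > stdout.log 2> stderr.log); then
        convert_output "$2" && touch "$2/DONE"
    else
        status=$?
        touch "$2/ERROR"
        return "$status"
    fi
}

# Example (inside a container): run_model "$BINARY" "$OUTDIR"
```

Returning the model's own exit code lets whatever started the container distinguish a failed run from a successful one.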

34.1 SIPNET

The following command will build sipnet v136 for PEcAn using the branch that is currently checked out.

docker build \
    --build-arg MODEL_VERSION=136 \
    --build-arg PECAN_VERSION=$(git rev-parse --abbrev-ref HEAD) \
    --tag pecan/pecan-sipnet:136 \
    --file docker/Dockerfile.sipnet \
    .

Once the process is finished you can push (upload) the created image to docker hub. It will use the tag to place the image. In this case the image will be placed in the pecan project as the pecan-sipnet repository and tagged with 136. To do this you can use:

# do this only once
docker login
# push image
docker push pecan/pecan-sipnet:136
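
To make the mapping from tag to location concrete, the sketch below splits an image reference into its project, repository, and tag using plain shell parameter expansion. It assumes the simple project/repository:tag form used in this chapter, with no registry host or digest.

```shell
# Split an image reference of the form project/repository:tag.
image=pecan/pecan-sipnet:136
project=${image%%/*}      # text before the first "/"
rest=${image#*/}          # text after the first "/"
repository=${rest%%:*}    # text before the ":"
tag=${rest##*:}           # text after the ":"
echo "project=$project repository=$repository tag=$tag"
```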

Once the image is pushed to Docker Hub, anybody can run the model. If you have not already downloaded the image, the run command will download it. Next it will run the image and execute either the default command or the command line given to the container. In the following example the default command is executed, which will run the model and generate the PEcAn output files.

# get test data (only need to do this once)
curl -o sipnet.data.zip http://isda.ncsa.illinois.edu/~kooper/PEcAn/sipnet/sipnet.data.zip
unzip sipnet.data.zip
# cleanup if you rerun
rm -f sipnet.data/{*.nc*,DONE,ERROR,sipnet.out,std*.log}
# run the actual model
docker run -t -i --rm -v ${PWD}/sipnet.data:/work pecan/pecan-sipnet:136
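
After the container exits you may want to check the outcome from the host. The following sketch assumes the run leaves a DONE or ERROR marker file in the mounted directory, which is inferred from the cleanup command above rather than documented here; run_status is a hypothetical helper name.

```shell
# Report the outcome of a run by looking for marker files in the work directory.
# Assumes DONE/ERROR marker files, as suggested by the cleanup command.
run_status() {
    # $1 = directory that was mounted at /work
    if [ -f "$1/ERROR" ]; then
        echo "failed"
    elif [ -f "$1/DONE" ]; then
        echo "finished"
    else
        echo "unknown"
    fi
}

run_status sipnet.data
```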

34.1.1 Using the PEcAn download.file() function

download.file(url, destination_file, method)

This custom PEcAn function works together with the base R function download.file (https://stat.ethz.ch/R-manual/R-devel/library/utils/html/download.file.html). However, it provides expanded functionality to generalize its use across a broad range of environments, because some computing environments are behind a firewall or proxy, including FTP firewalls. This may require the use of a custom FTP program and/or initial proxy server authentication to retrieve the files needed by PEcAn (e.g. meteorology drivers, other inputs) to run certain model simulations or tools. For example, Brookhaven National Laboratory (BNL) requires an initial connection to an FTP proxy before downloading files via the FTP protocol. As a result, the computers running PEcAn behind the BNL firewall (e.g. https://modex.bnl.gov) use the ncftp client (http://www.ncftp.com/) to download files for PEcAn. The methods available in base R download.file(), such as curl and libcurl, don’t have the functionality to provide credentials for a proxy, and those such as wget that do don’t easily allow for connecting through a proxy server before downloading files. The current option for use in these instances is ncftp, specifically ncftpget.


Examples:
HTTP

download.file("http://lib.stat.cmu.edu/datasets/csb/ch11b.txt","~/test.download.txt") 

FTP

download.file("ftp://ftp.cdc.noaa.gov/Datasets/NARR/monolevel/pres.sfc.2000.nc", "~/pres.sfc.2000.nc")

Customizing to use ncftp when running behind an FTP firewall (requires ncftp to be installed and available)

download.file("ftp://ftp.cdc.noaa.gov/Datasets/NARR/monolevel/pres.sfc.2000.nc", "~/pres.sfc.2000.nc", method="ncftpget")


On modex.bnl.gov, the ncftp firewall configuration file (e.g. ~/.ncftp/firewall) is configured as:

firewall-type=1
firewall-host=ftpgateway.sec.bnl.local
firewall-port=21

which then allows for direct connection through the firewall using a command like:

ncftpget ftp://ftp.unidata.ucar.edu/pub/netcdf/netcdf-fortran-4.4.4.tar.gz
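
A minimal sketch for creating that firewall configuration from the shell follows. The values are the BNL-specific ones quoted above and would need to be replaced for other sites. write_ncftp_firewall is a hypothetical helper name, and it takes the target directory as an argument so it can be tried in a scratch directory first.

```shell
# Write an ncftp firewall configuration (values are those quoted for modex.bnl.gov).
# Note: this overwrites any existing "firewall" file in the target directory.
write_ncftp_firewall() {
    # $1 = target directory, e.g. "$HOME/.ncftp"
    mkdir -p "$1" || return 1
    cat > "$1/firewall" <<'EOF'
firewall-type=1
firewall-host=ftpgateway.sec.bnl.local
firewall-port=21
EOF
}

# Example: write_ncftp_firewall "$HOME/.ncftp"
```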

To allow the use of ncftpget from within the download.file() function you need to set the download.ftp.method option in your R profile. To see your current R options, run options() at the R prompt; the output should look something like this:

> options()
$add.smooth
[1] TRUE

$bitmapType
[1] "cairo"

$browser
[1] "/usr/bin/xdg-open"

$browserNLdisabled
[1] FALSE

$CBoundsCheck
[1] FALSE

$check.bounds
[1] FALSE

$citation.bibtex.max
[1] 1

$continue
[1] "+ "

$contrasts
        unordered           ordered
"contr.treatment"      "contr.poly"

To set your download.ftp.method option, add a line such as the following to your ~/.Rprofile:

# set default FTP
options(download.ftp.method = "ncftpget")

On modex at BNL we have set the global option in /usr/lib64/R/etc/Rprofile.site.
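
If you prefer to make this change from the shell, a sketch like the following appends the line only when it is not already present, so it is safe to run more than once. add_ftp_option is a hypothetical helper name; point it at ~/.Rprofile or at a site-wide profile as appropriate.

```shell
# Append the option to an R profile file unless the exact line already exists.
add_ftp_option() {
    # $1 = profile file, e.g. "$HOME/.Rprofile"
    line='options(download.ftp.method = "ncftpget")'
    grep -qxF "$line" "$1" 2>/dev/null || printf '%s\n' "$line" >> "$1"
}

# Example: add_ftp_option "$HOME/.Rprofile"
```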

Once this is done you should be able to see the option set using this command in R:

> options("download.ftp.method")
$download.ftp.method
[1] "ncftpget"