22 PEcAn Setup

22.1 Installing the PEcAn VM

We recommend that new users download the PEcAn VM instead of creating from one from scratch.

Additional Setup - AWS - Shiny - Thredds

22.2 Install PEcAn by hand

These instructions are provided to document how to install PEcAn on different Operating Systems.

  • Prerequisites
  • OS specific installations
  • Install BETY
  • Install models
  • Clone and Compile PEcAn source code
  • Test Run

22.2.1 Installation Prerequisites

Please review the following topics before starting PEcAn installation Creating a Virtual Machine

First create virtual machine

# ----------------------------------------------------------------------
# - VM NAME  = PEcAn
# - CPU      = 2
# - MEMORY   = 2GB 
# - DISK     = 100GB
# - HOSTNAME = pecan
# - FULLNAME = PEcAn Demo User
# - USERNAME = xxxxxxx
# - PASSWORD = yyyyyyy
# - PACKAGE  = openssh
# ----------------------------------------------------------------------

To enable tunnels run the following on the host machine:

Make sure machine is up to date.



Install compiler and other packages needed and install the tools.



Install Virtual Box additions for better integration

Finishing up the machine

Add a message to the login:

Finishing up

Script to clean the VM and remove as much as possible history cleanvm.sh

Make sure machine has SSH keys rc.local

Change the resolution of the console

Once all done, stop the virtual machine Installing data for PEcAn

PEcAn assumes some of the data to be installed on the machine. This page will describe how to install this data. Flux Camp

Following will install the data for flux camp (as well as the demo script for PEcAn).

22.2.2 OS Specific Installations Ubuntu

These are specific notes for installing PEcAn on Ubuntu (14.04) and will be referenced from the main installing PEcAn page. You will at least need to install the build environment and Postgres sections. If you want to access the database/PEcAn using a web browser you will need to install Apache. To access the database using the BETY interface, you will need to have Ruby installed.

This document also contains information on how to install the Rstudio server edition as well as any other packages that can be helpful. Install Postgres

Documentation: http://trac.osgeo.org/postgis/wiki/UsersWikiPostGIS21UbuntuPGSQL93Apt

To install the BETYdb database .. ##### Apache Configuration PEcAn Additional packages

HDF5 Tools, netcdf, GDB and emacs CentOS/RedHat

These are specific notes for installing PEcAn on CentOS (7) and will be referenced from the main installing PEcAn page. You will at least need to install the build environment and Postgres sections. If you want to access the database/PEcAn using a web browser you will need to install Apache. To access the database using the BETY interface, you will need to have Ruby installed.

This document also contains information on how to install the Rstudio server edition as well as any other packages that can be helpful. Additional packages


HDF5 Tools, netcdf, GDB and emacs Mac OSX

These are specific notes for installing PEcAn on Mac OSX and will be referenced from the main installing PEcAn page. You will at least need to install the build environment and Postgres sections. If you want to access the database/PEcAn using a web browser you will need to install Apache. To access the database using the BETY interface, you will need to have Ruby installed.

This document also contains information on how to install the Rstudio server edition as well as any other packages that can be helpful. Install Postgres

For those on a Mac I use the following app for postgresql which has postgis already installed (http://postgresapp.com/)

To get postgis run the following commands in psql:

To check your postgis run the following command again in psql: SELECT PostGIS_full_version(); Apache Configuration

Mac does not support pdo/postgresql by default. The easiest way to install is use: http://php-osx.liip.ch/

To enable pecan to run from your webserver. Ruby

The default version of ruby should work. Or use JewelryBox. Rstudio Server

For the mac you can download Rstudio Desktop. CentOS / RHEL

These are specific notes for installing PEcAn on RedHat/CentOS and will be referenced from the main installing PEcAn page. You will at least need to install the build environment and Postgres sections. If you want to access the database/PEcAn using a web browser you will need to install Apache. To access the database using the BETY interface, you will need to have Ruby installed.

This document also contains information on how to install the Rstudio server edition as well as any other packages that can be helpful. Apache Configuration Install and configure Rstudio-server

based on Rstudio Server documentation

  • add PATH=$PATH:/usr/sbin:/sbin to /etc/profile
  • download and install server:
  • restart server sudo httpd restart
  • now you should be able to access http://<server>/rstudio Install Postgres

See documentation under the BETYdb Wiki

22.2.3 Installing BETY Install Postgres

See OS-specific instructions for installing Postgres + PostGIS * Ubuntu * OSX * RedHat / CentOS

22.2.4 Install Models

This page contains instructions on how to download and install ecosystem models that have been or are being coupled to PEcAn. These instructions have been tested on the PEcAn unbuntu VM. Commands may vary on other operating systems. Also, some model downloads require permissions before downloading, making them unavailable to the general public. Please contact the PEcAn team if you would like access to a model that is not already installed on the default PEcAn VM. CLM 4.5

The version of CLM installed on PEcAn is the ORNL branch provided by Dan Ricciuto. This version includes Dan’s point-level CLM processing scripts

Download the code (~300M compressed), input data (1.7GB compressed and expands to 14 GB), and a few misc inputs.

Required libraries

Compile and build default inputs CLM Test Run

You will see a new directory in scripts: US-UMB_I1850CLM45CN Enter this directory and run (you shouldn’t have to do this normally, but there is a bug with the python script and doing this ensures all files get to the right place):

Next you are ready to go to the run directory:

Open to edit file: datm.streams.txt.CLM1PT.CLM_USRDAT and check file paths such that all paths start with /home/carya/models/ccsm_inputdata

From this directory, launch the executable that resides in the bld directory:

not sure this was the right location, but wherever the executable is

You should begin to see output files that look like this: US-UMB_I1850CLM45CN.clm2.h0.yyyy-mm.nc (yyyy is year, mm is month) These are netcdf files containing monthly averages of lots of variables.

The lnd_in file in the run directory can be modified to change the output file frequency and variables. ED2 CLM-FATES


sudo apt-get upgrade libnetcdf-dev
sudo apt-get install subversion
sudo apt-get install csh
sudo apt-get install cmake
sudo ln -s /usr/bin/make /usr/bin/gmake
sudo rm /bin/sh
sudo ln -s /bin/bash /bin/sh

wget https://github.com/Unidata/netcdf-fortran/archive/v4.4.4.tar.gz
cd netcdf-4.4.4
sudo make install

you might need to mess around with installing netcdf and netcdf-fortran to get a version FATES likes…

Get code from Github (currently private) and go to cime/scripts directory

git clone git@github.com:NGEET/ed-clm.git
cd ed-clm/cime/scripts/

Within CLM-FATES, to be able to build an executable we need to create a reference run. We’ll also use this reference run to grab defaults from, so we’ll be registering the location of both the reference case (location of executable, scripts, etc) and the reference inputs with the PEcAn database. To begin, copy reference run script from pecan

cp ~/pecan/models/fates/inst/create_1x1_ref_case.sh .

Edit reference case script to set NETCDF_HOME, CROOT (reference run case), DIN_LOC_ROOT (reference run inputs). Also, make sure DIN_LOC_ROOT exists as FATES will not create it itself. Then run the script


Be aware that this script WILL ask you for your password on the NCAR server to download the reference case input data (the guest password may work, haven’t tried this). If it gives an error at the pio stage check the log, but the most likely error is it being unable to find a version of netcdf it likes.

Once FATES is installed, set the whole reference case directory as the Model path (leave filename blank) and set the whole inputs directory as an Input with format clm_defaults. GDAY

Navigate to a directory you would like to store GDAY and run the following:

gday is your executable. LPJ-GUESS

Instructions to download source code

Go to LPJ-GUESS website for instructions to access code. MAESPA

Navigate to a directory you would like store MAESPA and run the following:

maespa.out is your executable. Example input files can be found in the inpufiles directory. Executing measpa.out from within one of the example directories will produce output.

MAESPA developers have also developed a wrapper package called Maeswrap. The usual R package installation method install.packages may present issues with downloading an unpacking a dependency package called rgl. Here are a couple of solutions:

22.2.5 Download and Compile PEcAn


CRAN Reference Download, compile and install PEcAn from GitHub

Following will run a small script to setup some hooks to prevent people from using the pecan demo user account to check in any code.

22.2.6 PEcAn Testrun

Do the run, this assumes you have installed the BETY database, sites tar file and SIPNET.

NB: pecan.xml is configured for the virtual machine, you will need to change the field from ‘/home/carya/’ to wherever you installed your ‘sites’, usually $HOME

22.2.7 PEcAn Customizations PalEON version of PEcAn

Install the following packages: gnome2, xfce4, firefox

sudo apt-get -y install gdm gnome-shell xfce4 firefox

Install Rstudio-desktop (+ libjpeg62 seems to be a dependency)

PROC=$( uname -m | sed -e 's/x86_64/amd64/' -e 's/i686/i386/' )
wget http://download1.rstudio.org/rstudio-0.98.507-${PROC}.deb
sudo apt-get install libjpeg62
sudo dpkg -i rstudio-*.deb
rm rstudio-*.deb

Additional pieces of software

wget http://chrono.qub.ac.uk/blaauw/LinBacon_2.2.zip
unzip LinBacon_2.2.zip
rm LinBacon_2.2.zip

wget http://chrono.qub.ac.uk/blaauw/clam.zip
unzip clam.zip
rm clam.zip

Additional R packages:

cat << EOF | R --vanilla
list.of.packages <- c('R2jags', 'RCurl', 'RJSONIO', 'reshape2', 'plyr', 'fields',
                      'maps', 'maptools', 'ggplot2', 'mvtnorm', 'devtools')
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) {
  print("installing : ")
  install.packages(new.packages, repos="http://cran.rstudio.com/")

install_github("neotoma", "ropensci")

22.3 AWS Setup

22.3.1 Porting VM to AWS

The following are Mike’s rough notes from a first attempt to port the PEcAn VM to the AWS. This was done on a Mac

These notes are based on following the instructions here Convert PEcAn VM

AWS allows upload of files as VMDK but the default PEcAn VM is in OVA format

  1. If you haven’t done so already, download the PEcAn VM

  2. Split the OVA file into OVF and VMDK files

tar xf <ovafile> Set up an account on AWS

After you have an account you need to set up a user and save your access key and secret key

In my case I created a user named ‘carya’

Note: the key that ended up working had to be made at https://console.aws.amazon.com/iam/home#security_credential, not the link above. Install EC2 command line tools

wget http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip

sudo mkdir /usr/local/ec2

sudo unzip ec2-api-tools.zip -d /usr/local/ec2

If need be, download and install JDK

export JAVA_HOME=$(/usr/libexec/java_home)

export EC2_HOME=/usr/local/ec2/ec2-api-tools-<version>

export PATH=$PATH:$EC2_HOME/bin

Then set your user credentials as environment variables:

export AWS_ACCESS_KEY=xxxxxxxxxxxxxx

export AWS_SECRET_KEY=xxxxxxxxxxxxxxxxxxxxxx

Note: you may want to add all the variables set in the above EXPORT commands above into your .bashrc or equivalent. Create an AWS S3 ‘bucket’ to upload VM to

Go to https://console.aws.amazon.com/s3 and click “Create Bucket”

In my case I named the bucket ‘pecan’ Upload

In the code below, make sure to change the PEcAn version, the name of the bucket, and the name of the region. Make sure that the architecture matches the version of PEcAn you downloaded (i386 for 32 bit, x86_64 for 64 bit).

Also, you may want to choose a considerably larger instance type. The one chosen below is that corresponding to the AWS Free Tier

ec2-import-instance PEcAn32bit_1.2.6-disk1.vmdk --instance-type t2.micro --format VMDK --architecture i386 --platform Linux --bucket pecan --region us-east-1 --owner-akid $AWS_ACCESS_KEY --owner-sak $AWS_SECRET_KEY

Make sure to note the ID of the image since you’ll need it to check the VM status. Once the image is uploaded it will take a while (typically about an hour) for Amazon to convert the image to one it can run. You can check on this progress by running

ec2-describe-conversion-tasks <image.ID> Configuring the VM

On the EC2 management webpage, https://console.aws.amazon.com/ec2, if you select Instances on the left hand side (LHS) you should be able to see your new PEcAn image as an option under Launch Instance.

Before launching, you will want to update the firewall to open up additional ports that PEcAn needs – specifically port 80 for the webpage. Port 22 (ssh/sftp) should be open by default. Under “Security Groups” select “Inbound” then “Edit” and then add “HTTP”.

Select “Elastic IPs” on the LHS, and “Allocate New Address” in order to create a public IP for your VM.

Next, select “Network Interfaces” on the LHS and then under Actions select “Associate Addresses” then choose the Elastic IP you just created.

See also http://docs.aws.amazon.com/AmazonVPC/latest/GettingStartedGuide/GetStarted.html

22.3.2 Set up multiple instances (optional)

For info on setting up multiple instances with load balancing see: http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/gs-ec2VPC.html

Select “Load Balancers” on the LHS, click on “Create Load Balancer”, follow Wizard keeping defaults.

To be able to launch multiple VMs: Under “Instances” convert VM to an Image. When done, select Launch, enable multiple instances, and associate with the previous security group. Once running, go back to “Load Balancers” and add the instances to the load balancer. Each instance can be accessed individually by it’s own public IP, but external users should access the system more generally via the Load Balancers DNS. Booting the VM

Return to “Instances” using the menu on the LHS.

To boot the VM select “Actions” then “Instance State” then “Start”. In the future, once you have the VM loaded and configured this last step is the only one you will need to repeat to turn your VM on and off.

The menu provided should specify the Public IP where the VM has launched

22.4 Shiny Setup

Installing and configuring Shiny for PEcAn authors - Alexey Shiklomanov - Rob Kooper

NOTE: Instructions are only tested for CentOS 6.5 and Ubuntu 16.04 NOTE: Pretty much every step here requires root access.

22.4.1 Install the Shiny R package and Shiny server

Follow the instructions on the Shiny download page for the operating system you are using.

22.4.2 Modify the shiny configuration file

The Shiny configuration file is located in /etc/shiny-server/shiny-server.conf. Comment out the entire file and add the following, replacing <username> with your user name and <location> with the URL location you want for your app. This will allow you to run Shiny apps from your web browser at https://your.server.edu/shiny/your-location

run as shiny;
server {
    listen 3838;
    location /<location>/ {
        run as <username>;
        site_dir /path/to/your/shiny/app;
        log_dir /var/log/shiny-server;
        directory_index on;

For example, my configuration on the old test-pecan looks like this.

run as shiny;
server {
    listen 3838;
    location /ashiklom/ {
        run as ashiklom;
        site_dir /home/ashiklom/fs-data/pecan/shiny/;
        log_dir /var/log/shiny-server;
        directory_index on;

…and I can access my Shiny apps at, for instance, https://test-pecan.bu.edu/shiny/ashiklom/workflowPlots.

You can add as many location <loc> { ... } fields as you would like.

run as shiny;
server {
    listen 3838;
    location /ashiklom/ {
    location /bety/ {

If you change the configuration, for example to add a new location, you will need to restart Shiny server. If you are setting up a new instance of Shiny, skip this step and continue with the guide, since there are a few more steps to get Shiny working. If there is an instance of Shiny already running, you can restart it with:

## On CentOS
sudo service shiny-server stop
sudo service shiny-server start

## On Ubuntu
sudo systemctl stop shiny-server.service
sudo systemctl start shiny-server.service

22.4.3 Set the Apache proxy

Create a file with the following name, based on the version of the operating system you are using:

  • Ubuntu 16.04 (pecan1, pecan2, test-pecan) – /etc/apache2/conf-available/shiny.conf
  • CentOS 6.5 (psql-pecan) – /etc/httpd/conf.d/shiny.conf

Into this file, add the following:

ProxyPass           /shiny/ http://localhost:3838/
ProxyPassReverse    /shiny/ http://localhost:3838/
RedirectMatch permanent ^/shiny$ /shiny/ Ubuntu only: Enable the new shiny configuration

sudo a2enconf shiny

This will create a symbolic link to the newly created shiny.conf file inside the /etc/apache2/conf-enabled directory. You can do ls -l /etc/apache2/conf-enabled to confirm that this worked.

22.4.4 Enable and start the shiny server, and restart apache On CentOS

sudo ln -s /opt/shiny-server/config/init.d/redhat/shiny-server /etc/init.d
sudo service shiny-server stop
sudo service shiny-server start
sudo service httpd restart

You can check that Shiny is running with service shiny-server status. On Ubuntu

Enable the Shiny server service. This will make sure Shiny runs automatically on startup.

sudo systemctl enable shiny-server.service

Restart Apache.

sudo apachectl restart

Start the Shiny server.

sudo systemctl start shiny-server.service

If there are problems, you can stop the shiny-server.service with…

sudo systemctl stop shiny-server.service

…and then use start again to restart it.

22.4.5 Troubleshooting

Refer to the log files for shiny (/var/log/shiny-server.log) and httpd (on CentOS, /var/log/httpd/error-log; on Ubuntu, /var/log/apache2/error-log).

22.4.6 Further reading

22.5 Thredds Setup

Installing and configuring Thredds for PEcAn authors - Rob Kooper

NOTE: Instructions are only tested for Ubuntu 16.04 on the VM, if you have instructions for CENTOS/RedHat please update this documentation NOTE: Pretty much every step here requires root access.

22.5.1 Install the Tomcat 8 and Thredds webapp

The Tomcat 8 server can be installed from the default Ubuntu repositories. The thredds webapp will be downloaded and installed from unidata.

22.5.2 Ubuntu

First step is to install Tomcat 8 and configure it. The flag -Dtds.content.root.path should point to the location of where the thredds folder is located. This needs to be writeable by the user for tomcat. -Djava.security.egd is a special flag to use a different random number generator for tomcat. The default would take to long to generate a random number.

apt-get -y install tomcat8 openjdk-8-jdk
echo JAVA_OPTS=\"-Dtds.content.root.path=/home/carya \${JAVA_OPTS}\" >> /etc/default/tomcat8
echo JAVA_OPTS=\"-Djava.security.egd=file:/dev/./urandom \${JAVA_OPTS}\" >> /etc/default/tomcat8
service tomcat8 restart

Next is to install the webapp.

mkdir /home/carya/thredds
chmod 777 /home/carya/thredds

wget -O /var/lib/tomcat8/webapps/thredds.war ftp://ftp.unidata.ucar.edu/pub/thredds/4.6/current/thredds.war

Finally we configure Apache to prox the thredds server

cat > /etc/apache2/conf-available/thredds.conf << EOF
ProxyPass        /thredds/ http://localhost:8080/thredds/
ProxyPassReverse /thredds/ http://localhost:8080/thredds/
RedirectMatch permanent ^/thredds$ /thredds/
a2enmod proxy_http
a2enconf thredds
service apache2 reload

22.5.3 Customize the Thredds server

To customize the thredds server for your installation edit the file in /home/carya/thredds/threddsConfig.xml. For example the following file is included in the VM.

<?xml version="1.0" encoding="UTF-8"?>

  <!-- all options are commented out in standard install - meaning use default values -->
  <!-- see http://www.unidata.ucar.edu/software/thredds/current/tds/reference/ThreddsConfigXMLFile.html -->

    <abstract>Scientific Data</abstract>
    <keywords>meteorology, atmosphere, climate, ocean, earth science</keywords>
      <name>Rob Kooper</name>
      <logoAltText>PEcAn Project</logoAltText>

  The <catalogRoot> element:
  For catalogs you don't want visible from the /thredds/catalog.xml chain
  of catalogs, you can use catalogRoot elements. Each catalog root config
  catalog is crawled and used in configuring the TDS.


   * Setup for generated HTML pages.
   * NOTE: URLs may be absolute or relative, relative URLs must be relative
   * to the webapp URL, i.e., http://server:port/thredds/.
     * CSS documents used in generated HTML pages.
     * The CSS document given in the "catalogCssUrl" element is used for all pages
     * that are HTML catalog views. The CSS document given in the "standardCssUrl"
     * element is used in all other generated HTML pages.
     * -->

     * The Google Analytics Tracking code you would like to use for the
     * webpages associated with THREDDS. This will not track WMS or DAP
     * requests for data, only browsing the catalog.

    The <TdsUpdateConfig> element controls if and how the TDS checks
    for updates. The default is for the TDS to check for the current
    stable and development release versions, and to log that information
    in the TDS serverStartup.log file as INFO entries.

   The <CORS> element controls Cross-Origin Resource Sharing (CORS).
   CORS is a way to allow a website (such as THREDDS) to open up access
   to resources to web pages and applications running on a different domain.
   One example would be allowing a web-application to use fonts from
   a separate host. For TDS, this can allow a javascript app running on a
   different site to access data on a THREDDS server.
   For more information see: https://en.wikipedia.org/wiki/Cross-origin_resource_sharing
   The elements below represent defaults. Only the <enabled> tag is required
   to enable CORS. The default allowed origin is '*', which allows sharing
   to any domain.

   The <CatalogServices> element:
   - Services on local TDS served catalogs are always on.
   - Services on remote catalogs are set with the allowRemote element
   below. They are off by default (recommended).

  Configuring the CDM (netcdf-java library)
  see http://www.unidata.ucar.edu/software/netcdf-java/reference/RuntimeLoading.html

    <ioServiceProvider class="edu.univ.ny.stuff.FooFiles"/>
    <coordSysBuilder convention="foo" class="test.Foo"/>
    <coordTransBuilder name="atmos_ln_sigma_coordinates" type="vertical" class="my.stuff.atmosSigmaLog"/>
    <typedDatasetFactory datatype="Point" class="gov.noaa.obscure.file.Flabulate"/>

  CDM uses the DiskCache directory to store temporary files, like uncompressed files.
    <scour>1 hour</scour>
    <maxSize>1 Gb</maxSize>

  Caching open NetcdfFile objects.
  default is to allow 50 - 100 open files, cleanup every 11 minutes
    <scour>11 min</scour>

  The <HTTPFileCache> element:
  allow 10 - 20 open datasets, cleanup every 17 minutes
  used by HTTP Range requests.
    <scour>17 min</scour>

  Writing GRIB indexes.

  Persist joinNew aggregations to named directory. scour every 24 hours, delete stuff older than 90 days
    <scour>24 hours</scour>
    <maxAge>90 days</maxAge>

  How to choose the template dataset for an aggregation. latest, random, or penultimate

  The Netcdf Subset Service is off by default.
    <scour>10 min</scour>
    <maxAge>-1 min</maxAge>

  The WCS Service is off by default.
  Also, off by default (and encouraged) is operating on a remote dataset.
    <scour>15 min</scour>
    <maxAge>30 min</maxAge>



  <!-- CatalogGen service is off by default.

  <!-- DLwriter service is off by default.
       As is support for operating on remote catalogs.

  <!-- DqcService is off by default.

   Link to a Viewer application on the HTML page:

   Add a DataSource - essentially an IOSP with access to Servlet request parameters

   set FeatureCollection logging
       <MaxFileSize>1 MB</MaxFileSize>

    Configure how the NetCDF-4 C library is discovered and used.
    libraryPath: The directory in which the native library is installed.
    libraryName: The name of the native library. This will be used to locate the proper .DLL, .SO, or .DYLIB file
      within the libraryPath directory.
    useForReading: By default, the native library is only used for writing NetCDF-4 files; a pure-Java layer is
      responsible for reading them. However, if this property is set to true, then it will be used for reading
      NetCDF-4 (and HDF5) files as well.

22.5.4 Update the catalog

For example to update the catalog with the latest data, run the following command from the root crontab. This cronjob will also synchronize the database with remote servers and dump your database (by default in /home/carya/dump)

0 * * * * /home/carya/pecan/scripts/cron.sh -o /home/carya/dump

22.5.5 Troubleshooting

Refer to the log files for Tomcat (/var/log/tomcat8/*) and Thredds (/home/carya/thredds/logs).

22.5.6 Further reading