25 Maintaining PEcAn
This section covers how to keep your PEcAn instance up to date with the latest code and database. It aims to answer the question: how do I keep my PEcAn up to date?
25.1 Updating PEcAn Code and Bety Database
Release notes for all releases can be found here.
This page lists only the steps required to upgrade an existing system. When updating PEcAn, you are strongly encouraged to update BETY as well. Instructions for doing so, and for updating the database, are in the Updating BETYdb gitbook page.
25.1.1 Updating PEcAn
The latest version of PEcAn code can be obtained from the PEcAn repository on GitHub.
The PEcAn build system is based on GNU Make. The simplest way to install is to run make from inside the PEcAn directory. This will update the documentation for all packages and install them, as well as all required dependencies.
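As a rough sketch, an update could combine a git pull with a default make run. The function name update_pecan and the PECAN_DIR location are illustrative assumptions, not part of PEcAn; the sketch also assumes PEcAn was installed by cloning the GitHub repository with git.

```shell
#!/bin/bash
# Sketch only: PECAN_DIR is a hypothetical location; point it at your own clone.
PECAN_DIR="${PECAN_DIR:-$HOME/pecan}"

# Pull the latest code, then rebuild with the default make target.
# Runs in a subshell so the caller's working directory is left unchanged.
update_pecan() {
  (
    cd "$PECAN_DIR" || exit 1
    git pull || exit 1   # fetch and merge the latest code from the remote
    make                 # document and install packages that changed
  )
}

# Usage (not run automatically):
#   update_pecan
```

The function returns a non-zero status if the directory is missing or if either step fails, so it can be chained with other commands.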
For more control, the following make commands are available:
- make document – Use devtools::document to update the documentation for all packages. Under the hood, this uses the roxygen2 documentation system.
- make install – Install all packages and their dependencies using devtools::install. By default, this only installs packages that have had their code changed and any dependent packages.
- make check – Perform a rigorous check of packages using devtools::check.
- make test – Run all unit tests (based on the testthat package) for all packages, using devtools::test.
- make clean – Remove the make build cache, which is used to track which packages have changed. Cache files are stored in the .doc, .install, .check, and .test subdirectories in the PEcAn main directory. Running make clean will force the next invocation of make commands to operate on all PEcAn packages, regardless of changes.
The following are some additional make tricks that may be useful:
- Install, check, document, or test a specific package – make .<cmd>/<pkg-dir>; e.g. make .install/utils or make .check/modules/rtm
- Force make to run, even if the package has not changed – make -B <command>
- Run make commands in parallel – make -j<ncores>; e.g. make -j4 install to install packages using four parallel processes.
All instructions for the make build system are contained in the Makefile in the PEcAn root directory. For full documentation on make, see the man pages by running man make from a terminal.
Point of contact: Alexey Shiklomanov GitHub/Gitter: @ashiklom email: ashiklom@bu.edu
25.2 Updating BETY
25.3 Database synchronization
Database synchronization consists of two parts:
- Getting the data from the remote servers to your server
- Sharing your data with everybody else
25.3.1 How does it work?
Each server that runs the BETY database has a unique machine_id and an associated sequence of IDs. Whenever a user creates a new row in BETY, the row receives an ID from that sequence. This allows us to uniquely identify where a row came from. This information is crucial for the synchronization code, since it can copy exactly those rows whose IDs fall in a specified sequence. If you have not asked for a unique ID, your ID will be 99.
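To illustrate how a row ID can identify its origin, here is a minimal sketch. It assumes each machine_id owns one contiguous block of IDs. The block size ID_RANGE and the helper names id_block and owner_of are assumptions made for this example and are not taken from this page; check load.bety.sh for the values actually used.

```shell
#!/bin/bash
# Sketch only: ID_RANGE is an ASSUMED block size, chosen for illustration.
ID_RANGE=1000000000

# Print the first and last row ID that a given machine_id would own.
id_block() {
  local site=$1
  local first=$(( site * ID_RANGE + 1 ))
  local last=$(( first + ID_RANGE - 1 ))
  echo "$first $last"
}

# Print the machine_id whose block contains a given row ID.
owner_of() {
  local id=$1
  echo $(( (id - 1) / ID_RANGE ))
}
```

Under this assumed block size, id_block 99 prints 99000000001 100000000000, and owner_of maps any ID in that range back to 99.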
The synchronization code itself is split into two parts: loading data with the load.bety.sh script and exporting data using dump.bety.sh. If you do not plan to share data, you only need to use load.bety.sh to update your database.
25.3.2 Set up
Requests for new machine IDs are currently handled manually. To request a machine ID, contact Rob Kooper kooper@illinois.edu. In the examples below this ID is referred to as MYSITE.
To set up the database to use this ID, call load.bety.sh in ‘CREATE’ mode (replace MYSITE with your machine ID):
sudo -u postgres ${PECAN}/scripts/load.bety.sh -c YES -u YES -m MYSITE
WARNING: At the moment, running CREATE deletes all current records in the database. If you are running from the VM, this includes both all runs you have done and all information that the database is prepopulated with (e.g. input and model records). Remote records can be fetched (see below), but local records will be lost (we’re working on improving this!).
25.3.3 Fetch latest data
When logged into the machine you can fetch the latest data using the load.bety.sh script. The script will check what site you want to get the data for and will remove all data in the database associated with that id. It will then reinsert all the data from the remote database.
The script is configured using environment variables. The following variables are recognized:
- DATABASE: the database where the script should write the results. The default is bety.
- OWNER: the owner of the database (if it is to be created). The default is bety.
- PG_OPT: additional options to be added to psql (default is nothing).
- MYSITE: the (numerical) ID of your site. If you have not requested an ID, use 99; this is used for all sites that do not want to share their data (i.e. VM). 99 is in fact the default.
- REMOTESITE: the ID of the site you want to fetch the data from. The default is 0 (EBI).
- CREATE: If ‘YES’, this indicates that the existing database (bety, or the one specified by DATABASE) should be removed. Set to YES (in caps) to remove the database. THIS WILL REMOVE ALL DATA in DATABASE. The default is NO.
- KEEPTMP: indicates whether the downloaded file should be preserved. Set to YES (in caps) to keep downloaded files; the default is NO.
- USERS: determines if default users should be created. Set to YES (in caps) to create default users with default passwords. The default is NO.
All of these variables can also be specified as command line arguments. To see the options, use -h.
load.bety.sh -h
./scripts/load.bety.sh [-c YES|NO] [-d database] [-h] [-m my siteid] [-o owner] [-p psql options] [-r remote siteid] [-t YES|NO] [-u YES|NO]
-c create database, THIS WILL ERASE THE CURRENT DATABASE, default is NO
-d database, default is bety
-h this help page
-m site id, default is 99 (VM)
-o owner of the database, default is bety
-p additional psql command line options, default is empty
-r remote site id, default is 0 (EBI)
-t keep temp folder, default is NO
-u create carya users, this will create some default users
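The environment variables listed earlier map one-to-one onto the load.bety.sh flags shown in this help text. The sketch below prints the command-line arguments equivalent to whichever variables are currently set. The helper names add_opt and load_bety_args are hypothetical and are not part of the PEcAn scripts.

```shell
#!/bin/bash
# Sketch only: translate the load.bety.sh environment variables into flags.
# Values containing spaces (e.g. PG_OPT) would need extra quoting in real use.

# Append "-<flag> <value>" to ARGS when the value is non-empty.
add_opt() {
  if [ -n "$2" ]; then
    ARGS="$ARGS -$1 $2"
  fi
}

# Print the flags that correspond to the variables set in the environment.
load_bety_args() {
  ARGS=""
  add_opt c "${CREATE:-}"
  add_opt d "${DATABASE:-}"
  add_opt m "${MYSITE:-}"
  add_opt o "${OWNER:-}"
  add_opt p "${PG_OPT:-}"
  add_opt r "${REMOTESITE:-}"
  add_opt t "${KEEPTMP:-}"
  add_opt u "${USERS:-}"
  echo "${ARGS# }"
}
```

For example, MYSITE=1 REMOTESITE=2 ./scripts/load.bety.sh and ./scripts/load.bety.sh -m 1 -r 2 express the same request; the helper prints -m 1 -r 2 when only those two variables are set.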
dump.bety.sh -h
./scripts/dump.bety.sh [-a YES|NO] [-d database] [-h] [-l 0,1,2,3,4] [-m my siteid] [-o folder] [-p psql options] [-u YES|NO]
-a use anonymous user, default is YES
-d database, default is bety
-h this help page
-l level of data that can be dumped, default is 3
-m site id, default is 99 (VM)
-o output folder where dumped data is written, default is dump
-p additional psql command line options, default is -U bety
-u should unchecked data be dumped, default is NO
25.3.5 Automation
Below is an example of a script to synchronize PEcAn database instances across the network.
db.sync.sh
#!/bin/bash
## make sure psql is in PATH
export PATH=/usr/pgsql-9.3/bin/:$PATH
## move to export directory
cd /fs/data3/sync
## Dump Data
MYSITE=1 /home/dietze/pecan/scripts/dump.bety.sh
## Load Data from other sites
MYSITE=1 REMOTESITE=2 /home/dietze/pecan/scripts/load.bety.sh
MYSITE=1 REMOTESITE=5 /home/dietze/pecan/scripts/load.bety.sh
MYSITE=1 REMOTESITE=0 /home/dietze/pecan/scripts/load.bety.sh
## Timestamp sync log
echo $(date +%c) >> /home/dietze/db.sync.log
Typically such a script is set up to run as a cron job. Make sure to schedule this job (crontab -e) as the user that has database privileges (typically postgres). The example below is a cron table that runs the sync every hour at 12 minutes past the hour.
MAILTO=user@yourUniversity.edu
12 * * * * /home/dietze/db.sync.sh
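The db.sync.sh example above repeats the load call once per remote site. A looped variant is sketched below. The variable names SCRIPTS, SITE_ID and REMOTES and the function sync_all are illustrative assumptions; the default values simply mirror the paths and site IDs used in the example script.

```shell
#!/bin/bash
# Sketch only: a looped variant of db.sync.sh.
# SCRIPTS, SITE_ID and REMOTES are hypothetical names; set them for your site.
SCRIPTS="${SCRIPTS:-/home/dietze/pecan/scripts}"
SITE_ID="${SITE_ID:-1}"
REMOTES="${REMOTES:-2 5 0}"

sync_all() {
  local remote
  # Export local data first; stop if the dump fails.
  MYSITE="$SITE_ID" "$SCRIPTS/dump.bety.sh" || return 1
  for remote in $REMOTES; do
    # Continue with the next site if one load fails, but report it on stderr.
    MYSITE="$SITE_ID" REMOTESITE="$remote" "$SCRIPTS/load.bety.sh" \
      || echo "load from site $remote failed" >&2
  done
}

# Usage (not run automatically):
#   sync_all && date +%c >> /home/dietze/db.sync.log
```

Reporting a failed load instead of aborting means one unreachable remote does not prevent the remaining sites from syncing; whether that is the right trade-off depends on your setup.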
25.3.6 Network Status Map
https://pecan2.bu.edu/pecan/status.php
Nodes: red = down, yellow = out-of-date schema, green = good
Edges: red = fail, yellow = out-of-date sync, green = good
25.3.7 Tasks
The following is a list of tasks we plan to work on to improve these scripts:
- pecanproject/bety#368 allow site-specific customization of information and UI elements including title, contacts, logo, color scheme.