10 Remote execution with PEcAn
Remote execution allows the user to leverage the power and storage of high performance computing clusters, AWS instances, or specially configured virtual machines, but without leaving their local working environment. PEcAn uses remote execution primarily to run ecosystem models.
The infrastructure for remote execution lives in the
PEcAn.remote package (
base/remote in the PEcAn repository).
This section describes the following:
- Basics of command line SSH
- SSH authentication with keys and passwords
- Basics of SSH tunnels, and how they are used in PEcAn
- Basic remote exectuion R functions in
- Remote model execution configuration in the
- Additional information about preparing remote servers for execution
10.1 Basics of SSH
All of the PEcAn remote infrastructure depends on the system
ssh utility, so it’s important to make sure this works before attempting the advanced remote execution functionality in PEcAn.
To connect to a remote server interactively, the command is simply:
For instance, my connection to the BU shared computing cluster looks like:
…which will prompt me for my BU password, and, if successful, will drop me into a login shell on the remote machine.
Alternatively to the login shell,
ssh can be used to execute arbitrary code, whose output will be returned exactly as it would if you ran the command locally.
For example, the following:
…will run the
pwd command, and return the path to my home directory on the BU SCC.
The more advanced example below will run some simple R code on the BU SCC and return the output as if it was run locally.
10.2 SSH authentication – password vs. SSH key
Because this server uses passwords for authentication, this command will then prompt me for my password.
An alternative to password authentication is using SSH keys.
Under this system, the host machine (say, your laptop, or the PEcAn VM) has to generate a public and private key pair (using the
The private key (by default, a file in
~/.ssh/id_rsa) lives on the host machine, and should never be shared with anyone.
The public key will be distributed to any remote machines to which you want the host to be able to connect.
On each remote machine, the public key should be added to a list of authorized keys located in the
~/.ssh/authorized_keys file (on the remote machine).
The authorized keys list indicates which machines (technically, which keys – a single machine, and even a single user, can have many keys) are allowed to connect to it.
This is the system used by all of the PEcAn servers (
10.2.1 SSH tunneling
SSH authentication can be more advanced than indicated above, especially on systems that require dual authentication. Even simple password-protection can be tricky in scripts, since (by design) it is fairly difficult to get SSH to accept a password from anything other than the raw keyboard input (i.e. SSH doesn’t let you pass passwords as input or arguments, because this exposes your password as plain text).
A convenient and secure way to follow SSH security protocol, but prevent having to go through the full authentication process every time, is to use SSH tunnels (or “sockets”, which are effectively synonymous). Essentially, an SSH socket is a read- and write-protectected file that contains all of the information about an SSH connection.
To create an SSH tunnel, use a command like the following:
If appropriate, this will prompt you for your password (if using password authentication), and then will drop you back to the command line (thanks to the
-N flag, which runs SSH without executing a command, the
-f flag, which pushes SSH into the background, and the
-n flag, which prevents ssh from reading any input).
It will also create the file
To use this socket with another command, use the
-S /path/to/file flag, pointing to the same tunnel file you just created.
This will let you access the server without any sort of authentication step.
As before, if
<optional command> is blank, you will be dropped into an interactive shell on the remote, or if it’s a command, that command will be executed and the output returned.
To close a socket, use the following:
This will delete the socket file and close the connection. Alternatively, a scorched earth approach to closing the SSH tunnel if you don’t remember where you put the socket file is something like the following:
…which will kill all user processes called
To automatically create tunnels following a specific pattern, you can add the following to your
For more information, see
10.2.2 SSH tunnels and PEcAn
Many of the
PEcAn.remote functions assume that a tunnel is already open.
If working from the web interface, the tunnel will be opened for you by some under-the-hood PHP and Bash code, but if debugging or working locally, you will have to create the tunnel yourself.
The best way to do this is to create the tunnel first, outside of R, as described above.
(In the following examples, I’ll use my username
ashiklom connecting to the
test-pecan server with a socket stored in
To follow along, replace these with your own username and designated server, respectively).
Then, in R, create a
host object, which is just a list containing the elements
name (hostname) and
tunnel (path to tunnel file).
This host object can then be used in any of the remote execution functions.
10.3 Basic remote execute functions
PEcAn.remote::remote.execute.cmd function runs a system command on a remote server (or on the local server, if
host$name == "localhost").
x <- PEcAn.remote::remote.execute.cmd(host = my_host, cmd = "echo", args = "Hello world") x
remote.execute.cmd is similar to base R’s
system2, in that the base command (in this case,
echo) is passed separately from its arguments (
Note also that the output of the remote command is returned as a character.
For R code, there is a special wrapper around
PEcAn.remote::remote.execute.R, which runs R code (passed as a string) on a remote and returns the output.
code <- " x <- 2:4 y <- 3:1 x ^ y " out <- PEcAn.remote::remote.execute.R(code = code, host = my_host)
For additional functions related to remote file operations and other stuff, see the
PEcAn.remote package documentation.
10.4 Remote model execution with PEcAn
The workhorse of remote model execution is the
PEcAn.remote::start.model.runs function, which distributes execution of each run in a list of runs (e.g. multiple runs in an ensemble) to the local machine or a remote based on the configuration in the PEcAn settings.
Broadly, there are three major types of model execution:
- Serialized (
PEcAn.remote::start_serial) – This runs models one at a time, directly on the local machine or remote (i.e. same as calling the executables one at a time for each run).
- Via a queue system, (
PEcAn.remote::start_qsub) – This uses a queue management system, such as SGE (e.g.
qstat) found on the BU SCC machines, to submit jobs. For computationally intensive tasks, this is the recommended way to go.
- Via a model launcher script (
PEcAn.remote::setup_modellauncher) – This is a highly customizable approach where task submission is controlled by a user-provided script (
10.4.1 XML configuration
The relevant section of the PEcAn XML file is the
Here is a minimal example from one of my recent runs:
<host> <name>geo.bu.edu</name> <user>ashiklom</user> <tunnel>/home/carya/output//PEcAn_99000000008/tunnel/tunnel</tunnel> </host>
Breaking this down:
name– The hostname of the machine where the runs will be performed. Set it to
localhostto run on the local machine.
user– Your username on the remote machine (note that this may be different from the username on your local machine).
tunnel– This is the tunnel file for the connection used by all remote execution files. The tunnel is created automatically by the web interface, but must be created by the user for command line execution.
This configuration will run in serialized mode.
qsub, the configuration is slightly more involved:
<host> <name>geo.bu.edu</name> <user>ashiklom</user> <qsub>qsub -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash</qsub> <qsub.jobid>Your job ([0-9]+) .*</qsub.jobid> <qstat>qstat -j @JOBID@ || echo DONE</qstat> <tunnel>/home/carya/output//PEcAn_99000000008/tunnel/tunnel</tunnel> </host>
The additional fields are as follows:
qsub– The command used to submit jobs to the queue system. Despite the name, this can be any command used for any queue system. The following variables are available to be set here:
@NAME@– Job name to display
@STDOUT@– File to which
stdoutwill be redirected
@STDERR@– File to which
stderrwill be redirected
qsub.jobid– A regular expression, from which the job ID will be determined. This string will be parsed by R as
jobid <- gsub(qsub.jobid, "\\1", output)– note that the first pattern match is taken as the job ID.
qstat– The command used to check the status of a job. Internally, PEcAn will look for the
DONEstring at the end, so a structure like
<some command indicating if any jobs are still running> || echo DONEis required. The
@JOBID@here is the job ID determined from the
Documentation for using the model launcher is currently unavailable.
10.4.2 Configuration for PEcAn web interface
config.php has a few variables that will control where the web
interface can run jobs, and how to run those jobs. These variables
the near future
$qsuboptions will be
combined into a single list.
$SSHtunnel : points to the script that creates an SSH tunnel.
The script is located in the web folder and the default value of
dirname(__FILE__) . DIRECTORY_SEPARATOR . "sshtunnel.sh"; most
likely will work.
$hostlist : is an array with by default a single value, only
allowing jobs to run on the local server. Adding any other servers
to this list will show them in the pull down menu when selecting
machines, and will trigger the web page to be show to ask for a
username and password for the remote execution (make sure to use
HTTPS setup when asking for password to prevent it from being send
in the clear).
$qsublist : is an array of hosts that require qsub to be used
when running the models. This list can include
$fqdn to indicate
that jobs on the local machine should use qsub to run the models.
$qsuboptions : is an array that lists options for each machine.
Currently it support the following options (see also
array("geo.bu.edu" => array("qsub" => "qsub -V -N @NAME@ -o @STDOUT@ -e @STDERR@ -S /bin/bash", "jobid" => "Your job ([0-9]+) .*", "qstat" => "qstat -j @JOBID@ || echo DONE", "job.sh" => "module load udunits R/R-3.0.0_gnu-4.4.6", "models" => array("ED2" => "module load hdf5"))
In this list
qsub is the actual command line for qsub,
is the text returned from qsub,
qstat is the command to check
to see if the job is finished.
job.sh and the value in models
are additional entries to add to the job.sh file generated to
run the model. This can be used to make sure modules are loaded
on the HPC cluster before running the actual model.
10.5 Running PEcAn code for modules remotely
You can compile and install the model specific code pieces of PEcAn on the cluster easily without having to install the full code base of PEcAn (and all OS dependencies). All of the code pieces depend on PEcAn.utils to install this you can run the following on the cluster:
devtools::install_github("pecanproject/pecan", subdir = 'utils')
Next we need to install the model specific pieces, this is done almost the same:
devtools::install_github("pecanproject/pecan", subdir = 'models/ed')
This should install dependencies required. Following are some notes on how to install the model specifics on different HPC clusters.
Following modules need to be loaded:
module load hdf5 udunits R/R-3.0.0_gnu-4.4.6
Next the following packages need to be installed, otherwise it will fall back on the older versions install site-library
install.packages(c('udunits2', 'lubridate'), configure.args=c(udunits2='--with-udunits2-lib=/project/earth/packages/udunits-2.1.24/lib --with-udunits2-include=/project/earth/packages/udunits-2.1.24/include'), repos='http://cran.us.r-project.org')
Finally to install support for both ED and SIPNET:
devtools::install_github("pecanproject/pecan", subdir = 'utils') devtools::install_github("pecanproject/pecan", subdir = 'models/sipnet') devtools::install_github("pecanproject/pecan", subdir = 'models/ed')