25.4 PEcAn Docker Architecture

25.4.1 Overview

The PEcAn Docker architecture consists of many containers (see figure below) that communicate with each other. The goal of this architecture is to make it easy to expand the PEcAn system by deploying new model containers and registering them with PEcAn. Once a model is registered, users can use it in their work. The PEcAn framework sets up the configurations for the models and sends a message to the model containers to start execution. Once the execution is finished, the PEcAn framework continues. This is exactly as if the model were running on an HPC machine. Models can be executed in parallel by launching multiple model containers.

As can be seen in the figure, the architecture relies on two standard containers (in orange). The first container is PostgreSQL with PostGIS (mdillon/postgis), which stores the database used by both BETY and PEcAn. The second container is a message bus, specifically RabbitMQ (rabbitmq).

The BETY app container (pecan/bety) is the front end to the BETY database and is connected to the postgresql container. An HTTP server can be put in front of this container for SSL termination as well as for load balancing (by using multiple BETY app containers).

The PEcAn framework containers provide several distinct ways to interact with the PEcAn system (none of these containers have any models installed):

  • PEcAn shiny hosts the Shiny applications that have been developed and interacts with the database to get all the information it needs to display.
  • PEcAn rstudio is an RStudio environment with the PEcAn libraries preloaded. This allows for prototyping of new algorithms that can later be used as part of the PEcAn framework.
  • PEcAn web allows the user to create a new PEcAn workflow. The workflow is stored in the database, and the models are executed by the model containers.
  • PEcAn cli will allow the user to give a pecan.xml file that will be executed by the PEcAn framework. The workflow created from the XML file is stored in the database, and the models are executed by the model containers.

The model containers contain the actual models that are executed, as well as small wrappers that make them work in the PEcAn framework. The containers run the model based on the parameters received from the message bus and convert the outputs back to the standard PEcAn output format. Once a container has finished processing a message, it immediately gets the next message and starts processing it.
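The receive, run, repeat cycle of a model container can be illustrated with a minimal Python sketch. This is not PEcAn's actual wrapper code: an in-process queue.Queue stands in for RabbitMQ, and the runid message field and the run_model helper are hypothetical names chosen for illustration.

```python
import json
import queue


def run_model(params):
    """Hypothetical stand-in for running a model and converting its output."""
    return {"runid": params["runid"], "status": "finished"}


def work(bus):
    """Process messages one after another until the queue is empty."""
    finished = []
    while True:
        try:
            body = bus.get_nowait()  # fetch the next waiting message
        except queue.Empty:
            break  # a real container would keep waiting for new messages
        finished.append(run_model(json.loads(body)))
        bus.task_done()
    return finished


# queue.Queue stands in for the RabbitMQ message bus in this sketch.
bus = queue.Queue()
for runid in (1, 2, 3):
    bus.put(json.dumps({"runid": runid}))

results = work(bus)
print([r["runid"] for r in results])
```

Launching several such workers against the same queue is what allows models to run in parallel.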

25.4.2 PEcAn’s docker-compose

The PEcAn Docker architecture is described in full by the PEcAn docker-compose.yml file. For full docker-compose syntax, see the official documentation.

This section describes the top-level structure and each of the services.

For reference, the complete docker-compose file is as follows:

services:
  traefik:
    hostname: traefik
    image: traefik:v2.9
    command:
    - --log.level=INFO
    - --api=true
    - --api.dashboard=true
    - --entrypoints.web.address=:80
    - --providers.docker=true
    - --providers.docker.endpoint=unix:///var/run/docker.sock
    - --providers.docker.exposedbydefault=false
    - --providers.docker.watch=true
    restart: unless-stopped
    networks:
    - pecan
    security_opt:
    - no-new-privileges:true
    ports:
    - ${TRAEFIK_HTTP_PORT-80}:80
    volumes:
    - traefik:/config
    - /var/run/docker.sock:/var/run/docker.sock:ro
    labels:
    - traefik.enable=true
    - traefik.http.routers.traefik.entrypoints=web
    - traefik.http.routers.traefik.rule=Host(`traefik.pecan.localhost`)
    - traefik.http.routers.traefik.service=api@internal
  rabbitmq:
    hostname: rabbitmq
    image: rabbitmq:3.8-management
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_DEFAULT_USER=${RABBITMQ_DEFAULT_USER:-guest}
    - RABBITMQ_DEFAULT_PASS=${RABBITMQ_DEFAULT_PASS:-guest}
    labels:
    - traefik.enable=true
    - traefik.http.services.rabbitmq.loadbalancer.server.port=15672
    - traefik.http.routers.rabbitmq.entrypoints=web
    - traefik.http.routers.rabbitmq.rule=Host(`rabbitmq.pecan.localhost`)
    volumes:
    - rabbitmq:/var/lib/rabbitmq
    healthcheck:
      test: rabbitmqctl ping
      interval: 10s
      timeout: 5s
      retries: 5
  postgres:
    hostname: postgres
    image: mdillon/postgis:9.5
    restart: unless-stopped
    networks:
    - pecan
    volumes:
    - postgres:/var/lib/postgresql/data
    healthcheck:
      test: pg_isready -U postgres
      interval: 10s
      timeout: 5s
      retries: 5
  bety:
    hostname: bety
    image: pecan/bety:${BETY_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - UNICORN_WORKER_PROCESSES=1
    - SECRET_KEY_BASE=${BETY_SECRET_KEY:-notasecret}
    - RAILS_RELATIVE_URL_ROOT=/bety
    - LOCAL_SERVER=${BETY_LOCAL_SERVER:-99}
    volumes:
    - bety:/home/bety/log
    depends_on:
      postgres:
        condition: service_healthy
    labels:
    - traefik.enable=true
    - traefik.http.services.bety.loadbalancer.server.port=8000
    - traefik.http.routers.bety.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/bety/`)
    healthcheck:
      test: curl --silent --fail http://localhost:8000/$${RAILS_RELATIVE_URL_ROOT}
        > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  rstudio:
    hostname: rstudio
    image: pecan/base:${PECAN_VERSION:-latest}
    command: /work/rstudio.sh
    restart: unless-stopped
    networks:
    - pecan
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    environment:
    - KEEP_ENV=RABBITMQ_URI RABBITMQ_PREFIX RABBITMQ_PORT FQDN NAME
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    - RABBITMQ_PREFIX=/
    - RABBITMQ_PORT=15672
    - FQDN=${PECAN_FQDN:-docker}
    - NAME=${PECAN_NAME:-docker}
    - USER=${PECAN_RSTUDIO_USER:-carya}
    - PASSWORD=${PECAN_RSTUDIO_PASS:-illinois}
    - USERID=${UID:-1001}
    - GROUPID=${GID:-1001}
    volumes:
    - pecan:/data
    - rstudio:/home
    labels:
    - traefik.enable=true
    - traefik.http.routers.rstudio.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) &&
      PathPrefix(`/rstudio/`)
    - traefik.http.routers.rstudio.service=rstudio
    - traefik.http.routers.rstudio.middlewares=rstudio-stripprefix,rstudio-headers
    - traefik.http.services.rstudio.loadbalancer.server.port=8787
    - traefik.http.middlewares.rstudio-headers.headers.customrequestheaders.X-RStudio-Root-Path=/rstudio
    - traefik.http.middlewares.rstudio-stripprefix.stripprefix.prefixes=/rstudio
    - traefik.http.routers.rstudio-local.entrypoints=web
    - traefik.http.routers.rstudio-local.rule=Host(`rstudio.pecan.localhost`)
    - traefik.http.routers.rstudio-local.service=rstudio-local
    - traefik.http.services.rstudio-local.loadbalancer.server.port=8787
  docs:
    hostname: docs
    image: pecan/docs:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    labels:
    - traefik.enable=true
    - traefik.http.services.docs.loadbalancer.server.port=80
    - traefik.http.routers.docs.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/`)
    healthcheck:
      test: curl --silent --fail http://localhost/ > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  pecan:
    hostname: pecan-web
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/web:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    - FQDN=${PECAN_FQDN:-docker}
    - NAME=${PECAN_NAME:-docker}
    - SECRET_KEY_BASE=${BETY_SECRET_KEY:-thisisnotasecret}
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    labels:
    - traefik.enable=true
    - traefik.http.services.pecan.loadbalancer.server.port=8080
    - traefik.http.routers.pecan.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) &&
      PathPrefix(`/pecan/`)
    volumes:
    - pecan:/data
    - pecan:/var/www/html/pecan/data
    healthcheck:
      test: curl --silent --fail http://localhost:8080/pecan > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  monitor:
    hostname: monitor
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/monitor:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    - FQDN=${PECAN_FQDN:-docker}
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    labels:
    - traefik.enable=true
    - traefik.http.routers.monitor.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) &&
      PathPrefix(`/monitor/`)
    - traefik.http.routers.monitor.middlewares=monitor-stripprefix
    - traefik.http.middlewares.monitor-stripprefix.stripprefix.prefixes=/monitor
    volumes:
    - pecan:/data
    healthcheck:
      test: curl --silent --fail http://localhost:9999 > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  executor:
    hostname: executor
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/executor:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    - RABBITMQ_PREFIX=/
    - RABBITMQ_PORT=15672
    - FQDN=${PECAN_FQDN:-docker}
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  fates:
    hostname: fates
    user: ${UID:-1001}:${GID:-1001}
    image: ghcr.io/noresmhub/ctsm-api:latest
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  basgra:
    hostname: basgra
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-basgra-basgra_n_v1.0:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  sipnet:
    hostname: sipnet-git
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-sipnet-git:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  ed2:
    hostname: ed2-2_2_0
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-ed2-2.2.0:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  maespa:
    hostname: maespa-git
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-maespa-git:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  biocro:
    hostname: biocro-0_95
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-biocro-0.95:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  dbsync:
    hostname: dbsync
    image: pecan/shiny-dbsync:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    depends_on:
      postgres:
        condition: service_healthy
    labels:
    - traefik.enable=true
    - traefik.http.routers.dbsync.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) &&
      PathPrefix(`/dbsync/`)
    - traefik.http.routers.dbsync.middlewares=dbsync-stripprefix
    - traefik.http.middlewares.dbsync-stripprefix.stripprefix.prefixes=/dbsync
    healthcheck:
      test: curl --silent --fail http://localhost:3838 > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  api:
    hostname: api
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/api:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - PGHOST=${PGHOST:-postgres}
    - HOST_ONLY=${HOST_ONLY:-FALSE}
    - AUTH_REQ=${AUTH_REQ:-FALSE}
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    - DATA_DIR=${DATA_DIR:-/data/}
    - DBFILES_DIR=${DBFILES_DIR:-/data/dbfiles/}
    - SECRET_KEY_BASE=${BETY_SECRET_KEY:-thisisnotasecret}
    labels:
    - traefik.enable=true
    - traefik.http.routers.api.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/api/`)
    - traefik.http.services.api.loadbalancer.server.port=8000
    depends_on:
      postgres:
        condition: service_healthy
    volumes:
    - pecan:/data/
    healthcheck:
      test: curl --silent --fail http://localhost:8000/api/ping > /dev/null || exit
        1
      interval: 10s
      timeout: 5s
      retries: 5
networks:
  pecan: ~
volumes:
  traefik: ~
  postgres: ~
  bety: ~
  rabbitmq: ~
  pecan: ~
  rstudio: ~

There are two ways to override values in the docker-compose.yml file. The first is to create a file called .env in the same folder as the docker-compose.yml file. This file can override some of the configuration variables used by docker-compose. The following is an example of such a file:

# This file will override the configuration options in the docker-compose
# file. Copy this file to the same folder as docker-compose as .env

# ----------------------------------------------------------------------
# GENERAL CONFIGURATION
# ----------------------------------------------------------------------

# project name (-p flag for docker-compose)
#COMPOSE_PROJECT_NAME=pecan

# ----------------------------------------------------------------------
# TRAEFIK CONFIGURATION
# ----------------------------------------------------------------------

# hostname of server
#TRAEFIK_HOST=pecan-docker.ncsa.illinois.edu

# Run traefik on port 80 (http) and port 443 (https)
#TRAEFIK_HTTP_PORT=80
#TRAEFIK_HTTPS_PORT=443

# Use your real email address here to be notified if cert expires
#TRAEFIK_ACME_EMAIL=pecanproj@gmail.com

# ----------------------------------------------------------------------
# PEcAn CONFIGURATION
# ----------------------------------------------------------------------

# what version of pecan to use
#PECAN_VERSION=develop

# the fully qualified hostname used for this server
#PECAN_FQDN=pecan-docker.ncsa.illinois.edu

# short name shown in the menu
#PECAN_NAME=pecan-docker

# ----------------------------------------------------------------------
# BETY CONFIGURATION
# ----------------------------------------------------------------------

# what version of BETY to use
#BETY_VERSION=develop

# what is our server number, 99=vm, 98=docker
#BETY_LOCAL_SERVER=98

# secret used to encrypt cookies in BETY
#BETY_SECRET_KEY=1208q7493e8wfhdsohfo9ewhrfiouaho908ruq30oiewfdjspadosuf08q345uwrasdy98t7q243

# ----------------------------------------------------------------------
# MINIO CONFIGURATION
# ----------------------------------------------------------------------

# minio username and password
#MINIO_ACCESS_KEY=carya
#MINIO_SECRET_KEY=illinois

# ----------------------------------------------------------------------
# PORTAINER CONFIGURATION
# ----------------------------------------------------------------------

# password for portainer admin account
# use docker run --rm httpd:2.4-alpine htpasswd -nbB admin <password> | cut -d ":" -f 2
#PORTAINER_PASSWORD=$2y$05$5meDPBtS3NNxyGhBpYceVOxmFhiiC3uY5KEy2m0YRbWghhBr2EVn2

# ----------------------------------------------------------------------
# RABBITMQ CONFIGURATION
# ----------------------------------------------------------------------

# RabbitMQ username and password
#RABBITMQ_DEFAULT_USER=carya
#RABBITMQ_DEFAULT_PASS=illinois

# create the correct URI with above username and password
#RABBITMQ_URI=amqp://carya:illinois@rabbitmq/%2F

# ----------------------------------------------------------------------
# RSTUDIO CONFIGURATION
# ----------------------------------------------------------------------

# Default RStudio username and password for startup of container
#PECAN_RSTUDIO_USER=carya
#PECAN_RSTUDIO_PASS=illinois
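
To see how values from .env end up in the compose file, it helps to look at the two substitution forms the file uses: ${VAR:-default} falls back to the default when VAR is unset or empty, while ${VAR-default} falls back only when VAR is unset. The following Python sketch imitates that behaviour for illustration. It is not docker-compose's own implementation, the resolve function is a hypothetical helper, and it only covers these two forms plus the $$ escape.

```python
import re

# Matches "$$" (an escaped dollar sign) or ${NAME}, ${NAME-default}, ${NAME:-default}.
_PATTERN = re.compile(r"\$\$|\$\{(\w+)(?:(:?-)([^}]*))?\}")


def resolve(text, env):
    """Substitute variables in text using the mapping env (hypothetical helper)."""
    def substitute(match):
        if match.group(0) == "$$":
            return "$"  # "$$" yields a literal dollar sign
        name, operator, default = match.groups()
        value = env.get(name)
        if operator == ":-" and not value:
            return default  # unset OR empty: use the default
        if operator == "-" and value is None:
            return default  # only unset: use the default
        return value if value is not None else ""
    return _PATTERN.sub(substitute, text)


print(resolve("pecan/web:${PECAN_VERSION:-latest}", {}))
# pecan/web:latest
print(resolve("pecan/web:${PECAN_VERSION:-latest}", {"PECAN_VERSION": "develop"}))
# pecan/web:develop
print(resolve("${TRAEFIK_HTTP_PORT-80}:80", {}))
# 80:80
```

Uncommenting PECAN_VERSION=develop in .env therefore switches every image tagged ${PECAN_VERSION:-latest} to the develop tag.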

You can also extend the docker-compose.yml file with a docker-compose.override.yml file (in the same directory), which lets you add more services or, for example, change where the volumes are stored (see official documentation). For example, the following will change the volume for postgres to be stored in your home directory:

version: "3"

volumes:
  postgres:
    driver_opts:
      type: none
      device: ${HOME}/postgres
      o: bind
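
Conceptually, the override file is layered on top of docker-compose.yml: keys present in both are taken from the override, and everything else is kept. The Python sketch below illustrates this idea for nested mappings only. It is a simplified assumption, not docker-compose's actual merge logic (which has additional rules for lists and some specific keys); the merge function is a hypothetical helper, and /home/user/postgres stands in for the expanded ${HOME}/postgres.

```python
def merge(base, override):
    """Recursively merge two mappings; values from override win."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Volumes as declared in docker-compose.yml (None corresponds to "~").
base = {"volumes": {"postgres": None, "pecan": None}}

# The override shown above, with ${HOME} expanded to a hypothetical path.
override = {
    "volumes": {
        "postgres": {
            "driver_opts": {
                "type": "none",
                "device": "/home/user/postgres",
                "o": "bind",
            }
        }
    }
}

combined = merge(base, override)
print(combined["volumes"]["postgres"]["driver_opts"]["device"])
print(combined["volumes"]["pecan"])
```

Only the postgres volume changes; the pecan volume keeps its default configuration.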

25.4.3 Top-level structure

The root of the docker-compose.yml file contains three sections:

  • services – This is a list of services provided by the application, with each service corresponding to a container. When communicating with each other internally, the hostnames of containers correspond to their names in this section. For instance, regardless of the “project” name passed to docker-compose up, the hostname for connecting to the PostgreSQL database from any given container is always postgres (e.g. you should be able to access the PostgreSQL database by running the following from inside a container: psql -d bety -U bety -h postgres). The services comprising the PEcAn application are described below.

  • networks – This is a list of networks used by the application. Containers can only communicate with each other (via ports and hostnames) if they are on the same Docker network, and containers on different networks can only communicate through ports exposed by the host machine. We just provide the network name (pecan) and resort to Docker’s default network configuration. Note that the services we want connected to this network include a networks: ... - pecan tag. For more details on Docker networks, see the official documentation.

  • volumes – Similarly to networks, this just contains a list of volume names we want. Briefly, in Docker, volumes are directories containing files that are meant to be shared across containers. Each volume corresponds to a directory, which can be mounted at a specific location by different containers. For example, syntax like volumes: ... - pecan:/data in a service definition means to mount the pecan “volume” (including its contents) in the /data directory of that container. Volumes also allow data to persist between container restarts; normally, any data created by a container during its execution is lost when the container is re-launched. For example, using a volume for the database allows data to be saved between different runs of the database container. Without volumes, we would start with a blank database every time we restart the containers. For more details on Docker volumes, see the official documentation. The file above defines six volumes; the three most relevant to PEcAn's data are:

    • postgres – This contains the data files underlying the PEcAn PostgreSQL database (BETY). Notice that it is mounted by the postgres container to /var/lib/postgresql/data. This is the data that we pre-populate when we run the Docker commands to initialize the PEcAn database. Note that these are the values stored directly in the PostgreSQL database. The default files to which the database points (i.e. dbfiles) are stored in the pecan volume, described below.

    • rabbitmq – This volume contains persistent data for RabbitMQ. It is only used by the rabbitmq service.

    • pecan – This volume contains PEcAn’s dbfiles, which include downloaded and converted model inputs, processed configuration files, and outputs. It is used by almost all of the services in the PEcAn stack, and is typically mounted to /data.

25.4.4 traefik

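Traefik manages communication among the different PEcAn services and between PEcAn and the web. Among other things, traefik facilitates the setup of web access to each PEcAn service via common and easy-to-remember URLs. For instance, the following labels on the pecan service (which runs the pecan/web image) configure access to the PEcAn web interface via the URL http://pecan.localhost/pecan/ :

labels:
- traefik.enable=true
- traefik.http.services.pecan.loadbalancer.server.port=8080
- traefik.http.routers.pecan.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/pecan/`)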

(Further details in the works…)
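
The Host and PathPrefix rules in the service labels of the compose file can be read as a small routing table. The Python sketch below is a simplified model of that table, not Traefik's implementation: the route function is a hypothetical helper, TRAEFIK_HOST is assumed to be left at its default of pecan.localhost, and picking the longest matching prefix stands in for Traefik's default behaviour of giving longer rules higher priority.

```python
# (router name, host, path prefix) taken from the labels in the compose file.
# The rabbitmq rule has no PathPrefix, so "/" is used to match any path.
ROUTERS = [
    ("docs", "pecan.localhost", "/"),
    ("pecan", "pecan.localhost", "/pecan/"),
    ("bety", "pecan.localhost", "/bety/"),
    ("rabbitmq", "rabbitmq.pecan.localhost", "/"),
]


def route(host, path):
    """Return the name of the router that would handle the request, or None."""
    candidates = [
        (name, prefix)
        for name, rule_host, prefix in ROUTERS
        if host == rule_host and path.startswith(prefix)
    ]
    if not candidates:
        return None
    # The most specific (longest) prefix wins.
    return max(candidates, key=lambda item: len(item[1]))[0]


print(route("pecan.localhost", "/pecan/index.php"))   # pecan
print(route("pecan.localhost", "/docs/pecan/"))       # docs
print(route("rabbitmq.pecan.localhost", "/"))         # rabbitmq
```

This is why the docs service can claim the catch-all prefix / without shadowing /pecan/ or /bety/.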

The traefik service configuration looks like this:

traefik:
  hostname: traefik
  image: traefik:v2.9
  command:
  - --log.level=INFO
  - --api=true
  - --api.dashboard=true
  - --entrypoints.web.address=:80
  - --providers.docker=true
  - --providers.docker.endpoint=unix:///var/run/docker.sock
  - --providers.docker.exposedbydefault=false
  - --providers.docker.watch=true
  restart: unless-stopped
  networks:
  - pecan
  security_opt:
  - no-new-privileges:true
  ports:
  - ${TRAEFIK_HTTP_PORT-80}:80
  volumes:
  - traefik:/config
  - /var/run/docker.sock:/var/run/docker.sock:ro
  labels:
  - traefik.enable=true
  - traefik.http.routers.traefik.entrypoints=web
  - traefik.http.routers.traefik.rule=Host(`traefik.pecan.localhost`)
  - traefik.http.routers.traefik.service=api@internal

25.4.5 portainer

Portainer is a lightweight management UI that allows you to manage the Docker host (or swarm). You can use this service to monitor the different containers, view the log files, and start and stop containers.

The complete docker-compose.yml file shown above does not include a portainer service, so no configuration is shown here.

Portainer is accessible by browsing to pecan.localhost/portainer/. You can either set the password in the .env file (for an example see env.example) or use the web browser and go to the portainer URL. The first time you visit, it will ask you for a password.

25.4.6 minio

Minio is a service that provides access to a folder on disk through a variety of protocols, including S3 buckets and web-based access. We mainly use Minio to facilitate access to PEcAn data using a web browser without the need for CLI tools.

The complete docker-compose.yml file shown above does not include a minio service, so no configuration is shown here.

The Minio interface is accessible by browsing to minio.pecan.localhost. From there, you can browse directories and download files. You can also upload files by clicking the red “+” in the bottom-right corner.

Note that it is currently impossible to create or upload directories using the Minio interface (except in the /data root directory – those folders are called “buckets” in Minio). Therefore, the recommended way to perform any file management tasks other than individual file uploads is through the command line, e.g.

docker run -it --rm --volume pecan_pecan:/data --volume /path/to/local/directory:/localdir ubuntu

# Now, you can move files between `/data` and `/localdir`, create new directories, etc.

25.4.7 thredds

This service allows PEcAn model outputs to be accessible via the THREDDS data server (TDS). When the PEcAn stack is running, the catalog can be explored in a web browser at http://pecan.localhost/thredds/catalog.html. Specific output files can also be accessed from the command line via commands like the following:

Note that everything after outputs/ exactly matches the directory structure of the workflows directory.

Which files are served, which subsetting services are available, and other aspects of the data server’s behavior are configured in the docker/thredds_catalog.xml file. Specifically, this XML tells the data server to use the datasetScan tool to serve all files within the /data/workflows directory, with the additional filter that only files ending in .nc are served. For additional information about the syntax of this file, see the extensive THREDDS documentation.

The complete docker-compose.yml file shown above does not include a thredds service, so no configuration is shown here.

25.4.8 postgres

This service provides a working PostGIS database. Our configuration is fairly straightforward:

postgres:
  hostname: postgres
  image: mdillon/postgis:9.5
  restart: unless-stopped
  networks:
  - pecan
  volumes:
  - postgres:/var/lib/postgresql/data
  healthcheck:
    test: pg_isready -U postgres
    interval: 10s
    timeout: 5s
    retries: 5

Some additional details about our configuration:

  • image – This pulls a container with PostgreSQL + PostGIS pre-installed. Note that by default, we use PostgreSQL version 9.5. To experiment with other versions, you can change 9.5 accordingly.

  • networks – This allows PostgreSQL to communicate with other containers on the pecan network. As mentioned above, the hostname of this service is just its name, i.e. postgres, so to connect to the database from inside a running container, use a command like the following: psql -d bety -U bety -h postgres

  • volumes – Note that the PostgreSQL data files (which store the values in the SQL database) are stored on a volume called postgres (which is not the same as the postgres service, even though they share the same name).

25.4.9 rabbitmq

RabbitMQ is a message broker service. In PEcAn, RabbitMQ functions as a task manager and scheduler, coordinating the execution of different tasks (such as running models and analyzing results) associated with the PEcAn workflow.

Our configuration is as follows:

rabbitmq:
  hostname: rabbitmq
  image: rabbitmq:3.8-management
  restart: unless-stopped
  networks:
  - pecan
  environment:
  - RABBITMQ_DEFAULT_USER=${RABBITMQ_DEFAULT_USER:-guest}
  - RABBITMQ_DEFAULT_PASS=${RABBITMQ_DEFAULT_PASS:-guest}
  labels:
  - traefik.enable=true
  - traefik.http.services.rabbitmq.loadbalancer.server.port=15672
  - traefik.http.routers.rabbitmq.entrypoints=web
  - traefik.http.routers.rabbitmq.rule=Host(`rabbitmq.pecan.localhost`)
  volumes:
  - rabbitmq:/var/lib/rabbitmq
  healthcheck:
    test: rabbitmqctl ping
    interval: 10s
    timeout: 5s
    retries: 5

Note that the traefik.http.routers.rabbitmq.rule indicates that browsing to http://rabbitmq.pecan.localhost/ leads to the RabbitMQ management console.

By default, the RabbitMQ management console has username/password guest/guest, which is highly insecure. For production instances of PEcAn, we highly recommend changing these credentials to something more secure, and removing access to the RabbitMQ management console via Traefik.
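
When the credentials are changed, RABBITMQ_URI has to be updated to match RABBITMQ_DEFAULT_USER and RABBITMQ_DEFAULT_PASS. The following Python sketch shows one way to assemble such a URI. The rabbitmq_uri helper is a hypothetical name used for illustration; the host rabbitmq and the default virtual host / (percent-encoded as %2F) are taken from the default URI in the compose file.

```python
from urllib.parse import quote


def rabbitmq_uri(user, password, host="rabbitmq", vhost="/"):
    """Build an AMQP URI, percent-encoding user, password and virtual host."""
    return "amqp://{}:{}@{}/{}".format(
        quote(user, safe=""),
        quote(password, safe=""),
        host,
        quote(vhost, safe=""),  # the default vhost "/" becomes "%2F"
    )


print(rabbitmq_uri("carya", "illinois"))
# amqp://carya:illinois@rabbitmq/%2F
```

Percent-encoding matters once a password contains characters such as @ or /, which would otherwise be misread as URI delimiters.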

25.4.10 bety

This service operates the BETY web interface, which is effectively a web-based front-end to the PostgreSQL database. Unlike the postgres service, which contains all the data needed to run PEcAn models, this service is not essential to the PEcAn workflow. However, note that certain features of the PEcAn web interface do link to the BETY web interface and will not work if this container is not running.

Our configuration is as follows:

bety:
  hostname: bety
  image: pecan/bety:${BETY_VERSION:-latest}
  restart: unless-stopped
  networks:
  - pecan
  environment:
  - UNICORN_WORKER_PROCESSES=1
  - SECRET_KEY_BASE=${BETY_SECRET_KEY:-notasecret}
  - RAILS_RELATIVE_URL_ROOT=/bety
  - LOCAL_SERVER=${BETY_LOCAL_SERVER:-99}
  volumes:
  - bety:/home/bety/log
  depends_on:
    postgres:
      condition: service_healthy
  labels:
  - traefik.enable=true
  - traefik.http.services.bety.loadbalancer.server.port=8000
  - traefik.http.routers.bety.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/bety/`)
  healthcheck:
    test: curl --silent --fail http://localhost:8000/$${RAILS_RELATIVE_URL_ROOT} >
      /dev/null || exit 1
    interval: 10s
    timeout: 5s
    retries: 5

The BETY container Dockerfile is located in the root directory of the BETY GitHub repository (direct link).

25.4.11 docs

This service will show the documentation for the version of PEcAn running as well as a homepage with links to all relevant endpoints. You can access this at http://pecan.localhost/. You can find the documentation for PEcAn at http://pecan.localhost/docs/pecan/.

Our current configuration is as follows:

docs:
  hostname: docs
  image: pecan/docs:${PECAN_VERSION:-latest}
  restart: unless-stopped
  networks:
  - pecan
  labels:
  - traefik.enable=true
  - traefik.http.services.docs.loadbalancer.server.port=80
  - traefik.http.routers.docs.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/`)
  healthcheck:
    test: curl --silent --fail http://localhost/ > /dev/null || exit 1
    interval: 10s
    timeout: 5s
    retries: 5

25.4.12 web

This service runs the PEcAn web interface. It is effectively a thin wrapper around a standard Apache web server container from Docker Hub that installs some additional dependencies and copies over the necessary files from the PEcAn source code.

In the complete docker-compose.yml file shown above, this service is defined under the name pecan and uses the pecan/web image.

Its Dockerfile ships with the PEcAn source code, in docker/web/Dockerfile.

In terms of actively developing PEcAn using Docker, this is the service to modify when making changes to the web interface (i.e. PHP, HTML, and JavaScript code located in the PEcAn web directory).

25.4.13 executor

This service is in charge of running the R code underlying the core PEcAn workflow. However, it is not in charge of executing the models themselves: model binaries are located in their own dedicated Docker containers, and model execution is coordinated by RabbitMQ.

Our configuration is as follows:

executor:
  hostname: executor
  user: ${UID:-1001}:${GID:-1001}
  image: pecan/executor:${PECAN_VERSION:-latest}
  restart: unless-stopped
  networks:
  - pecan
  environment:
  - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
  - RABBITMQ_PREFIX=/
  - RABBITMQ_PORT=15672
  - FQDN=${PECAN_FQDN:-docker}
  depends_on:
    postgres:
      condition: service_healthy
    rabbitmq:
      condition: service_healthy
  volumes:
  - pecan:/data

Its Dockerfile ships with the PEcAn source code, in docker/executor/Dockerfile. Its image is built on top of the pecan/base image (docker/base/Dockerfile), which contains the actual PEcAn source. To facilitate caching, the pecan/base image is itself built on top of the pecan/depends image (docker/depends/Dockerfile), a large image that contains an R installation and PEcAn’s many system and R package dependencies (which usually take ~30 minutes or longer to install from scratch).

In terms of actively developing PEcAn using Docker, this is the service to modify when making changes to the PEcAn R source code. Note that, unlike changes to the web image’s PHP code, changes to the R source code do not immediately propagate to the PEcAn container; instead, you have to re-compile the code by running make inside the container.

25.4.14 monitor

This service shows all models that are currently running at http://pecan.localhost/monitor/. The list is returned as JSON and shows all models (grouped by type and version) that are currently running or were seen in the past. It also contains a list of all currently active containers, as well as how many jobs are waiting to be processed.

This service is also responsible for registering any new models with PEcAn so that users can select and execute them from the web interface.

Our current configuration is as follows:

monitor:
  hostname: monitor
  user: ${UID:-1001}:${GID:-1001}
  image: pecan/monitor:${PECAN_VERSION:-latest}
  restart: unless-stopped
  networks:
  - pecan
  environment:
  - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
  - FQDN=${PECAN_FQDN:-docker}
  depends_on:
    postgres:
      condition: service_healthy
    rabbitmq:
      condition: service_healthy
  labels:
  - traefik.enable=true
  - traefik.http.routers.monitor.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) &&
    PathPrefix(`/monitor/`)
  - traefik.http.routers.monitor.middlewares=monitor-stripprefix
  - traefik.http.middlewares.monitor-stripprefix.stripprefix.prefixes=/monitor
  volumes:
  - pecan:/data
  healthcheck:
    test: curl --silent --fail http://localhost:9999 > /dev/null || exit 1
    interval: 10s
    timeout: 5s
    retries: 5

25.4.15 Model-specific containers

Additional models are added as additional services. In general, their configuration should be similar to the following configuration for SIPNET, which ships with PEcAn:

sipnet:
  hostname: sipnet-git
  user: ${UID:-1001}:${GID:-1001}
  image: pecan/model-sipnet-git:${PECAN_VERSION:-latest}
  restart: unless-stopped
  networks:
  - pecan
  environment:
  - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
  depends_on:
    rabbitmq:
      condition: service_healthy
  volumes:
  - pecan:/data

The PEcAn source contains Dockerfiles for ED2 (models/ed/Dockerfile) and SIPNET (models/sipnet/Dockerfile) that can serve as references. For additional tips on constructing a Dockerfile for your model, see Dockerfiles for Models.