8.8 PEcAn Docker Architecture

8.8.1 Overview

The PEcAn Docker architecture consists of many containers (see figure below) that communicate with each other. The goal of this architecture is to make it easy to expand the PEcAn system by deploying new model containers and registering them with PEcAn. Once a model is registered, users can use it in their work. The PEcAn framework sets up the configuration for the model and sends a message to the model container to start execution. Once execution has finished, the PEcAn framework continues. From the point of view of the framework, this is the same as running the model on an HPC machine. Models can be executed in parallel by launching multiple model containers.

As can be seen in the figure the architecture leverages two standard containers (in orange). The first container is PostgreSQL with PostGIS (mdillon/postgis) which is used to store the database used by both BETY and PEcAn. The second container is a message bus, more specifically RabbitMQ (rabbitmq).

The BETY app container (pecan/bety) is the front end to the BETY database and is connected to the PostgreSQL container.

The PEcAn framework containers provide several distinct ways to interact with the PEcAn system (none of these containers has any models installed):

  • PEcAn RStudio is an RStudio Server environment with the PEcAn libraries preloaded. This allows for prototyping of new algorithms that can be used as part of the PEcAn framework later.
  • PEcAn web allows the user to create a new PEcAn workflow. The workflow is stored in the database, and the models are executed by the model containers.
  • PEcAn API provides a RESTful interface for programmatic access to PEcAn workflows, models, and data.
  • PEcAn monitor tracks running models and registers new model containers with PEcAn.

The model containers contain the actual models that are executed as well as small wrappers to make them work in the PEcAn framework. The containers will run the model based on the parameters received from the message bus and convert the outputs back to the standard PEcAn output format. Once the container is finished processing a message it will immediately get the next message and start processing it.
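
The processing loop described above can be sketched roughly as follows. This is only an illustration: it uses the Python standard-library queue.Queue as a stand-in for the RabbitMQ message bus, and run_model and the run_id field are hypothetical names rather than the actual PEcAn wrapper code.

```python
import queue


def run_model(job):
    """Hypothetical stand-in for a model wrapper."""
    # A real wrapper would launch the model with the received parameters
    # and convert its output to the standard PEcAn output format.
    return {"run_id": job["run_id"], "status": "finished"}


def worker(bus):
    """Process messages one at a time until the queue is empty."""
    results = []
    while True:
        try:
            job = bus.get_nowait()  # fetch the next message, if any
        except queue.Empty:
            break  # a real container would keep waiting for new messages
        results.append(run_model(job))
        bus.task_done()  # mark the message as handled only after processing
    return results


# Three queued runs, handled strictly in order by a single worker.
bus = queue.Queue()
for run_id in (1, 2, 3):
    bus.put({"run_id": run_id})

results = worker(bus)
print(results)
```

Launching several such workers against the same queue is the analogue of starting multiple model containers to run models in parallel.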

8.8.2 PEcAn’s docker-compose

The PEcAn Docker architecture is described in full by the PEcAn docker-compose.yml file. For full docker-compose syntax, see the official documentation.

The sections below describe the top-level structure and each of the services. For reference, the complete docker-compose file is as follows:

services:
  traefik:
    hostname: traefik
    image: traefik:v2.9
    command:
    - --log.level=INFO
    - --api=true
    - --api.dashboard=true
    - --entrypoints.web.address=:80
    - --providers.docker=true
    - --providers.docker.endpoint=unix:///var/run/docker.sock
    - --providers.docker.exposedbydefault=false
    - --providers.docker.watch=true
    restart: unless-stopped
    networks:
    - pecan
    security_opt:
    - no-new-privileges:true
    ports:
    - ${TRAEFIK_HTTP_PORT-80}:80
    volumes:
    - traefik:/config
    - /var/run/docker.sock:/var/run/docker.sock:ro
    labels:
    - traefik.enable=true
    - traefik.http.routers.traefik.entrypoints=web
    - traefik.http.routers.traefik.rule=Host(`traefik.pecan.localhost`)
    - traefik.http.routers.traefik.service=api@internal
  rabbitmq:
    hostname: rabbitmq
    image: rabbitmq:3.8-management
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_DEFAULT_USER=${RABBITMQ_DEFAULT_USER:-guest}
    - RABBITMQ_DEFAULT_PASS=${RABBITMQ_DEFAULT_PASS:-guest}
    labels:
    - traefik.enable=true
    - traefik.http.services.rabbitmq.loadbalancer.server.port=15672
    - traefik.http.routers.rabbitmq.entrypoints=web
    - traefik.http.routers.rabbitmq.rule=Host(`rabbitmq.pecan.localhost`)
    volumes:
    - rabbitmq:/var/lib/rabbitmq
    healthcheck:
      test: rabbitmqctl ping
      interval: 10s
      timeout: 5s
      retries: 5
  postgres:
    hostname: postgres
    image: mdillon/postgis:9.5
    restart: unless-stopped
    networks:
    - pecan
    volumes:
    - postgres:/var/lib/postgresql/data
    healthcheck:
      test: pg_isready -U postgres
      interval: 10s
      timeout: 5s
      retries: 5
  bety:
    hostname: bety
    image: pecan/bety:${BETY_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - UNICORN_WORKER_PROCESSES=1
    - SECRET_KEY_BASE=${BETY_SECRET_KEY:-notasecret}
    - RAILS_RELATIVE_URL_ROOT=/bety
    - LOCAL_SERVER=${BETY_LOCAL_SERVER:-99}
    volumes:
    - bety:/home/bety/log
    depends_on:
      postgres:
        condition: service_healthy
    labels:
    - traefik.enable=true
    - traefik.http.services.bety.loadbalancer.server.port=8000
    - traefik.http.routers.bety.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/bety/`)
    healthcheck:
      test: curl --silent --fail http://localhost:8000/$${RAILS_RELATIVE_URL_ROOT}
        > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  rstudio:
    hostname: rstudio
    platform: linux/amd64
    image: pecan/base:${PECAN_VERSION:-latest}
    command: /work/rstudio.sh
    restart: unless-stopped
    networks:
    - pecan
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    environment:
    - KEEP_ENV=RABBITMQ_URI RABBITMQ_PREFIX RABBITMQ_PORT FQDN NAME
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    - RABBITMQ_PREFIX=/
    - RABBITMQ_PORT=15672
    - FQDN=${PECAN_FQDN:-docker}
    - NAME=${PECAN_NAME:-docker}
    - DEFAULT_USER=${PECAN_RSTUDIO_USER:-carya}
    - PASSWORD=${PECAN_RSTUDIO_PASS:-illinois}
    - USERID=${UID:-1001}
    - GROUPID=${GID:-1001}
    volumes:
    - pecan:/data
    - rstudio:/home
    labels:
    - traefik.enable=true
    - traefik.http.routers.rstudio.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) &&
      PathPrefix(`/rstudio/`)
    - traefik.http.routers.rstudio.service=rstudio
    - traefik.http.routers.rstudio.middlewares=rstudio-stripprefix,rstudio-headers
    - traefik.http.services.rstudio.loadbalancer.server.port=8787
    - traefik.http.middlewares.rstudio-headers.headers.customrequestheaders.X-RStudio-Root-Path=/rstudio
    - traefik.http.middlewares.rstudio-stripprefix.stripprefix.prefixes=/rstudio
    - traefik.http.routers.rstudio-local.entrypoints=web
    - traefik.http.routers.rstudio-local.rule=Host(`rstudio.pecan.localhost`)
    - traefik.http.routers.rstudio-local.service=rstudio-local
    - traefik.http.services.rstudio-local.loadbalancer.server.port=8787
  docs:
    hostname: docs
    image: pecan/docs:${PECAN_VERSION:-latest}
    platform: linux/amd64
    restart: unless-stopped
    networks:
    - pecan
    labels:
    - traefik.enable=true
    - traefik.http.services.docs.loadbalancer.server.port=80
    - traefik.http.routers.docs.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/`)
    healthcheck:
      test: curl --silent --fail http://localhost/ > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  pecan:
    hostname: pecan-web
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/web:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    - FQDN=${PECAN_FQDN:-docker}
    - NAME=${PECAN_NAME:-docker}
    - SECRET_KEY_BASE=${BETY_SECRET_KEY:-thisisnotasecret}
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    labels:
    - traefik.enable=true
    - traefik.http.services.pecan.loadbalancer.server.port=8080
    - traefik.http.routers.pecan.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) &&
      PathPrefix(`/pecan/`)
    volumes:
    - pecan:/data
    - pecan:/var/www/html/pecan/data
    healthcheck:
      test: curl --silent --fail http://localhost:8080/pecan > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  monitor:
    hostname: monitor
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/monitor:${PECAN_VERSION:-latest}
    platform: linux/amd64
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    - FQDN=${PECAN_FQDN:-docker}
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    labels:
    - traefik.enable=true
    - traefik.http.routers.monitor.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) &&
      PathPrefix(`/monitor/`)
    - traefik.http.routers.monitor.middlewares=monitor-stripprefix
    - traefik.http.middlewares.monitor-stripprefix.stripprefix.prefixes=/monitor
    volumes:
    - pecan:/data
    healthcheck:
      test: curl --silent --fail http://localhost:9999 > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  executor:
    hostname: executor
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/executor:${PECAN_VERSION:-latest}
    platform: linux/amd64
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    - RABBITMQ_PREFIX=/
    - RABBITMQ_PORT=15672
    - FQDN=${PECAN_FQDN:-docker}
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  fates:
    hostname: fates
    user: ${UID:-1001}:${GID:-1001}
    image: ghcr.io/noresmhub/ctsm-api:latest
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  basgra:
    hostname: basgra
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-basgra-basgra_n_v1:${PECAN_VERSION:-latest}
    platform: linux/amd64
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  sipnet:
    hostname: sipnet-git
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-sipnet-git:${PECAN_VERSION:-latest}
    platform: linux/amd64
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  ed2:
    hostname: ed2-2_2_0
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-ed2-2.2.0:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  maespa:
    hostname: maespa-git
    platform: linux/amd64
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-maespa-git:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  biocro:
    hostname: biocro-0_95
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-biocro-0.95:${PECAN_VERSION:-latest}
    platform: linux/amd64
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
    - pecan:/data
  api:
    hostname: api
    platform: linux/amd64
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/api:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
    - pecan
    environment:
    - PGHOST=${PGHOST:-postgres}
    - HOST_ONLY=${HOST_ONLY:-FALSE}
    - AUTH_REQ=${AUTH_REQ:-FALSE}
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    - DATA_DIR=${DATA_DIR:-/data/}
    - DBFILES_DIR=${DBFILES_DIR:-/data/dbfiles/}
    - SECRET_KEY_BASE=${BETY_SECRET_KEY:-thisisnotasecret}
    labels:
    - traefik.enable=true
    - traefik.http.routers.api.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/api/`)
    - traefik.http.services.api.loadbalancer.server.port=8000
    depends_on:
      postgres:
        condition: service_healthy
    volumes:
    - pecan:/data/
    healthcheck:
      test: curl --silent --fail http://localhost:8000/api/ping > /dev/null || exit
        1
      interval: 10s
      timeout: 5s
      retries: 5
networks:
  pecan: ~
volumes:
  traefik: ~
  postgres: ~
  bety: ~
  rabbitmq: ~
  pecan: ~
  rstudio: ~

There are two ways to override values in the docker-compose.yml file. The first method is to create a file called .env in the same folder as the docker-compose.yml file. This file can override some of the configuration variables used by docker-compose. The following is an example .env file:

# This file will override the configuration options in the docker-compose
# file. Copy this file to the same folder as docker-compose as .env

# ----------------------------------------------------------------------
# GENERAL CONFIGURATION
# ----------------------------------------------------------------------

# project name (-p flag for docker-compose)
#COMPOSE_PROJECT_NAME=pecan

# ----------------------------------------------------------------------
# TRAEFIK CONFIGURATION
# ----------------------------------------------------------------------

# hostname of server
#TRAEFIK_HOST=pecan-docker.ncsa.illinois.edu

# Run traefik on port 80 (http) and port 443 (https)
#TRAEFIK_HTTP_PORT=80
#TRAEFIK_HTTPS_PORT=443

# Use your real email address here to be notified if cert expires
#TRAEFIK_ACME_EMAIL=pecanproj@gmail.com

# ----------------------------------------------------------------------
# PEcAn CONFIGURATION
# ----------------------------------------------------------------------

# what version of pecan to use
#PECAN_VERSION=develop

# the fully qualified hostname used for this server
#PECAN_FQDN=pecan-docker.ncsa.illinois.edu

# short name shown in the menu
#PECAN_NAME=pecan-docker

# ----------------------------------------------------------------------
# BETY CONFIGURATION
# ----------------------------------------------------------------------

# what version of BETY to use
#BETY_VERSION=develop

# what is our server number, 99=vm, 98=docker
#BETY_LOCAL_SERVER=98

# secret used to encrypt cookies in BETY
#BETY_SECRET_KEY=1208q7493e8wfhdsohfo9ewhrfiouaho908ruq30oiewfdjspadosuf08q345uwrasdy98t7q243

# ----------------------------------------------------------------------
# MINIO CONFIGURATION
# ----------------------------------------------------------------------

# minio username and password
#MINIO_ACCESS_KEY=carya
#MINIO_SECRET_KEY=illinois

# ----------------------------------------------------------------------
# PORTAINER CONFIGURATION
# ----------------------------------------------------------------------

# password for portainer admin account
# use docker run --rm httpd:2.4-alpine htpasswd -nbB admin <password> | cut -d ":" -f 2
#PORTAINER_PASSWORD=$2y$05$5meDPBtS3NNxyGhBpYceVOxmFhiiC3uY5KEy2m0YRbWghhBr2EVn2

# ----------------------------------------------------------------------
# RABBITMQ CONFIGURATION
# ----------------------------------------------------------------------

# RabbitMQ username and password
#RABBITMQ_DEFAULT_USER=carya
#RABBITMQ_DEFAULT_PASS=illinois

# create the correct URI with above username and password
#RABBITMQ_URI=amqp://carya:illinois@rabbitmq/%2F

# ----------------------------------------------------------------------
# RSTUDIO CONFIGURATION
# ----------------------------------------------------------------------

# Default RStudio username and password for startup of container
#PECAN_RSTUDIO_USER=carya
#PECAN_RSTUDIO_PASS=illinois
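
To make the effect of these variables concrete, the following is a minimal sketch of how placeholders such as ${PECAN_VERSION:-latest} in docker-compose.yml are resolved. It is a simplified imitation written for illustration, not the docker-compose implementation: the substitute function is a hypothetical helper, and it handles only the ${NAME}, ${NAME-default} and ${NAME:-default} forms (it does not handle the $$ escape used in the healthcheck commands).

```python
import re

# Matches ${NAME}, ${NAME-default} and ${NAME:-default}.
_PATTERN = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?:(?P<op>:?-)(?P<default>[^}]*))?\}"
)


def substitute(text, env):
    """Replace placeholders in text using the mapping env."""

    def repl(match):
        name, op, default = match.group("name", "op", "default")
        value = env.get(name)
        if op == ":-":
            # The default applies when the variable is unset or empty.
            return value if value else default
        if op == "-":
            # The default applies only when the variable is unset.
            return value if value is not None else default
        return value if value is not None else ""

    return _PATTERN.sub(repl, text)


# With no .env entry, the default after ":-" is used.
print(substitute("pecan/web:${PECAN_VERSION:-latest}", {}))
# With PECAN_VERSION=develop in .env, the default is ignored.
print(substitute("pecan/web:${PECAN_VERSION:-latest}", {"PECAN_VERSION": "develop"}))
```

Note the difference between the two default forms: the traefik port mapping uses ${TRAEFIK_HTTP_PORT-80} (without the colon), so an empty TRAEFIK_HTTP_PORT= line in .env would produce an empty value rather than falling back to 80.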

The second method is to extend the docker-compose.yml file with a docker-compose.override.yml file in the same directory, which lets you add more services or change settings such as where the volumes are stored (see official documentation). For example, the following stores the postgres volume in a directory under your home directory:

volumes:
  postgres:
    driver_opts:
      type: none
      device: ${HOME}/postgres
      o: bind

8.8.3 Top-level structure

The root of the docker-compose.yml file contains three sections:

  • services – This is a list of services provided by the application, with each service corresponding to a container. When containers communicate with each other internally, their hostnames correspond to their names in this section. For instance, regardless of the “project” name passed to docker-compose up, the hostname for connecting to the PostgreSQL database from any other container is always postgres (e.g. you should be able to access the PostgreSQL database by running the following from inside a container: psql -d bety -U bety -h postgres). The services comprising the PEcAn application are described below.

  • networks – This is a list of networks used by the application. Containers can only communicate with each other (via ports and hostnames) if they are on the same Docker network; containers on different networks can only communicate through ports exposed by the host machine. We provide only the network name (pecan) and rely on Docker’s default network configuration. Note that the services we want connected to this network include a networks: ... - pecan tag. For more details on Docker networks, see the official documentation.

  • volumes – Similarly to networks, this just contains a list of volume names we want. Briefly, in Docker, volumes are directories containing files that are meant to be shared across containers. Each volume corresponds to a directory, which can be mounted at a specific location by different containers. For example, syntax like volumes: ... - pecan:/data in a service definition means to mount the pecan “volume” (including its contents) in the /data directory of that container. Volumes also allow data to persist on containers between restarts, as normally, any data created by a container during its execution is lost when the container is re-launched. For example, using a volume for the database allows data to be saved between different runs of the database container. Without volumes, we would start with a blank database every time we restart the containers. For more details on Docker volumes, see the official documentation. Here, we define six volumes:

    • traefik – This contains configuration data for the Traefik reverse proxy. It is only used by the traefik service.

    • postgres – This contains the data files underlying the PEcAn PostgreSQL database (BETY). Notice that it is mounted by the postgres container to /var/lib/postgresql/data. This is the data that we pre-populate when we run the Docker commands to initialize the PEcAn database. Note that these are the values stored directly in the PostgreSQL database. The default files to which the database points (i.e. dbfiles) are stored in the pecan volume, described below.

    • bety – This volume contains log files for the BETY Rails application. It is only used by the bety service.

    • rabbitmq – This volume contains persistent data for RabbitMQ. It is only used by the rabbitmq service.

    • pecan – This volume contains PEcAn’s dbfiles, which include downloaded and converted model inputs, processed configuration files, and outputs. It is used by almost all of the services in the PEcAn stack, and is typically mounted to /data.

    • rstudio – This volume contains the home directories for RStudio users. It is only used by the rstudio service and is mounted to /home.

8.8.4 traefik

Traefik is a reverse proxy that routes incoming web requests to the different PEcAn services. Among other things, it makes each PEcAn service reachable via common and easy-to-remember URLs. The current configuration uses Traefik v2.9 with Docker provider auto-discovery.

Services are exposed via Traefik labels in each service’s configuration. For example, the following lines in the pecan (web) service configure access to the PEcAn web interface via the URL http://pecan.localhost/pecan/:

labels:
- traefik.enable=true
- traefik.http.services.pecan.loadbalancer.server.port=8080
- traefik.http.routers.pecan.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/pecan/`)
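
As a rough illustration of how rules of the form Host(...) && PathPrefix(...) select a service, the sketch below mimics the matching for the path prefixes used in the complete docker-compose file. It is a simplification, not Traefik code: the route function and ROUTES table are hypothetical, and Traefik’s default priority (longer rules win) is approximated here by preferring the longest matching prefix.

```python
# Path prefixes taken from the traefik.http.routers.*.rule labels.
ROUTES = {
    "/pecan/": "pecan",
    "/bety/": "bety",
    "/api/": "api",
    "/rstudio/": "rstudio",
    "/monitor/": "monitor",
    "/": "docs",  # catch-all: matches every path on the host
}


def route(host, path, rule_host="pecan.localhost"):
    """Return the service name a request would be routed to, or None."""
    if host != rule_host:
        return None  # the Host(...) part of the rule does not match
    candidates = [prefix for prefix in ROUTES if path.startswith(prefix)]
    if not candidates:
        return None
    # Approximate "longest rule wins" by choosing the longest prefix.
    return ROUTES[max(candidates, key=len)]


print(route("pecan.localhost", "/pecan/index.php"))
print(route("pecan.localhost", "/docs/pecan/"))
print(route("example.org", "/pecan/"))
```

The first request is routed to the pecan service, the second falls through to the docs catch-all, and the third matches no rule because the host differs from TRAEFIK_HOST.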

The Traefik dashboard is accessible at http://traefik.pecan.localhost.

Note that HTTPS/SSL configuration is not included in the base docker-compose.yml. For HTTPS support with Let’s Encrypt, see docker-compose.https.yml.

The traefik service configuration looks like this:

traefik:
  hostname: traefik
  image: traefik:v2.9
  command:
  - --log.level=INFO
  - --api=true
  - --api.dashboard=true
  - --entrypoints.web.address=:80
  - --providers.docker=true
  - --providers.docker.endpoint=unix:///var/run/docker.sock
  - --providers.docker.exposedbydefault=false
  - --providers.docker.watch=true
  restart: unless-stopped
  networks:
  - pecan
  security_opt:
  - no-new-privileges:true
  ports:
  - ${TRAEFIK_HTTP_PORT-80}:80
  volumes:
  - traefik:/config
  - /var/run/docker.sock:/var/run/docker.sock:ro
  labels:
  - traefik.enable=true
  - traefik.http.routers.traefik.entrypoints=web
  - traefik.http.routers.traefik.rule=Host(`traefik.pecan.localhost`)
  - traefik.http.routers.traefik.service=api@internal

8.8.5 postgres

This service provides a working PostGIS database. Our configuration is fairly straightforward:

postgres:
  hostname: postgres
  image: mdillon/postgis:9.5
  restart: unless-stopped
  networks:
  - pecan
  volumes:
  - postgres:/var/lib/postgresql/data
  healthcheck:
    test: pg_isready -U postgres
    interval: 10s
    timeout: 5s
    retries: 5

Some additional details about our configuration:

  • image – This pulls a container with PostgreSQL + PostGIS pre-installed. Note that by default, we use PostgreSQL version 9.5. To experiment with other versions, you can change 9.5 accordingly.

  • networks – This allows PostgreSQL to communicate with other containers on the pecan network. As mentioned above, the hostname of this service is just its name, i.e. postgres, so to connect to the database from inside a running container, use a command like the following: psql -d bety -U bety -h postgres

  • volumes – Note that the PostgreSQL data files (which store the values in the SQL database) are stored on a volume called postgres (which is not the same as the postgres service, even though they share the same name).

8.8.6 rabbitmq

RabbitMQ is a message broker service. In PEcAn, RabbitMQ functions as a task manager and scheduler, coordinating the execution of different tasks (such as running models and analyzing results) associated with the PEcAn workflow.

Our configuration is as follows:

rabbitmq:
  hostname: rabbitmq
  image: rabbitmq:3.8-management
  restart: unless-stopped
  networks:
  - pecan
  environment:
  - RABBITMQ_DEFAULT_USER=${RABBITMQ_DEFAULT_USER:-guest}
  - RABBITMQ_DEFAULT_PASS=${RABBITMQ_DEFAULT_PASS:-guest}
  labels:
  - traefik.enable=true
  - traefik.http.services.rabbitmq.loadbalancer.server.port=15672
  - traefik.http.routers.rabbitmq.entrypoints=web
  - traefik.http.routers.rabbitmq.rule=Host(`rabbitmq.pecan.localhost`)
  volumes:
  - rabbitmq:/var/lib/rabbitmq
  healthcheck:
    test: rabbitmqctl ping
    interval: 10s
    timeout: 5s
    retries: 5

Note that the traefik.http.routers.rabbitmq.rule indicates that browsing to http://rabbitmq.pecan.localhost/ leads to the RabbitMQ management console.

By default, the RabbitMQ management console has username/password guest/guest, which is highly insecure. For production instances of PEcAn, we highly recommend changing these credentials to something more secure, and removing access to the RabbitMQ management console via Traefik.
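
The same credentials are embedded in the RABBITMQ_URI variable passed to the other services, so both need to be changed together. As a small sketch, the Python standard library can take such a URI apart and check whether it still uses the defaults. The helper names parse_amqp_uri and uses_default_credentials are made up for this example; 5672 is the standard AMQP port, used here when the URI does not specify one.

```python
from urllib.parse import unquote, urlsplit


def parse_amqp_uri(uri):
    """Split an amqp:// URI into its components."""
    parts = urlsplit(uri)
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port or 5672,  # standard AMQP port when omitted
        # The virtual host is percent-encoded: %2F decodes to "/".
        "vhost": unquote(parts.path[1:]) or "/",
    }


def uses_default_credentials(uri):
    """Return True if the URI still uses guest/guest."""
    info = parse_amqp_uri(uri)
    return info["user"] == "guest" and info["password"] == "guest"


default_uri = "amqp://guest:guest@rabbitmq/%2F"
print(parse_amqp_uri(default_uri))
print(uses_default_credentials(default_uri))
print(uses_default_credentials("amqp://carya:illinois@rabbitmq/%2F"))
```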

8.8.7 bety

This service operates the BETY web interface, which is effectively a web-based front-end to the PostgreSQL database. Unlike the postgres service, which contains all the data needed to run PEcAn models, this service is not essential to the PEcAn workflow. However, note that certain features of the PEcAn web interface do link to the BETY web interface and will not work if this container is not running.

Our configuration is as follows:

bety:
  hostname: bety
  image: pecan/bety:${BETY_VERSION:-latest}
  restart: unless-stopped
  networks:
  - pecan
  environment:
  - UNICORN_WORKER_PROCESSES=1
  - SECRET_KEY_BASE=${BETY_SECRET_KEY:-notasecret}
  - RAILS_RELATIVE_URL_ROOT=/bety
  - LOCAL_SERVER=${BETY_LOCAL_SERVER:-99}
  volumes:
  - bety:/home/bety/log
  depends_on:
    postgres:
      condition: service_healthy
  labels:
  - traefik.enable=true
  - traefik.http.services.bety.loadbalancer.server.port=8000
  - traefik.http.routers.bety.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/bety/`)
  healthcheck:
    test: curl --silent --fail http://localhost:8000/$${RAILS_RELATIVE_URL_ROOT} >
      /dev/null || exit 1
    interval: 10s
    timeout: 5s
    retries: 5

The BETY container Dockerfile is located in the root directory of the BETY GitHub repository (direct link).

8.8.8 rstudio

This service provides an RStudio Server environment with PEcAn libraries pre-installed. It is useful for interactively exploring PEcAn data, prototyping new analyses, and debugging workflows.

RStudio is accessible at http://pecan.localhost/rstudio/ or directly at http://rstudio.pecan.localhost. The default credentials are username carya and password illinois, configurable via the PECAN_RSTUDIO_USER and PECAN_RSTUDIO_PASS environment variables in the .env file.

Our current configuration is as follows:

rstudio:
  hostname: rstudio
  platform: linux/amd64
  image: pecan/base:${PECAN_VERSION:-latest}
  command: /work/rstudio.sh
  restart: unless-stopped
  networks:
  - pecan
  depends_on:
    postgres:
      condition: service_healthy
    rabbitmq:
      condition: service_healthy
  environment:
  - KEEP_ENV=RABBITMQ_URI RABBITMQ_PREFIX RABBITMQ_PORT FQDN NAME
  - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
  - RABBITMQ_PREFIX=/
  - RABBITMQ_PORT=15672
  - FQDN=${PECAN_FQDN:-docker}
  - NAME=${PECAN_NAME:-docker}
  - DEFAULT_USER=${PECAN_RSTUDIO_USER:-carya}
  - PASSWORD=${PECAN_RSTUDIO_PASS:-illinois}
  - USERID=${UID:-1001}
  - GROUPID=${GID:-1001}
  volumes:
  - pecan:/data
  - rstudio:/home
  labels:
  - traefik.enable=true
  - traefik.http.routers.rstudio.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) &&
    PathPrefix(`/rstudio/`)
  - traefik.http.routers.rstudio.service=rstudio
  - traefik.http.routers.rstudio.middlewares=rstudio-stripprefix,rstudio-headers
  - traefik.http.services.rstudio.loadbalancer.server.port=8787
  - traefik.http.middlewares.rstudio-headers.headers.customrequestheaders.X-RStudio-Root-Path=/rstudio
  - traefik.http.middlewares.rstudio-stripprefix.stripprefix.prefixes=/rstudio
  - traefik.http.routers.rstudio-local.entrypoints=web
  - traefik.http.routers.rstudio-local.rule=Host(`rstudio.pecan.localhost`)
  - traefik.http.routers.rstudio-local.service=rstudio-local
  - traefik.http.services.rstudio-local.loadbalancer.server.port=8787

The RStudio service runs on the pecan/base image with a startup script (/work/rstudio.sh). It mounts both the pecan volume (for data access at /data) and the rstudio volume (for persistent home directories at /home).
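
The labels above attach two middlewares to the /rstudio/ route: one strips the /rstudio prefix before the request reaches RStudio Server, and one adds an X-RStudio-Root-Path header. Their combined effect on a request can be sketched as follows. This is a simplified model for illustration only, and apply_rstudio_middlewares is a hypothetical function, not part of Traefik or PEcAn.

```python
def apply_rstudio_middlewares(path, headers, prefix="/rstudio"):
    """Return the (path, headers) pair forwarded to the rstudio container."""
    forwarded = dict(headers)  # leave the caller's headers untouched
    # rstudio-headers: record the public root path in a request header.
    forwarded["X-RStudio-Root-Path"] = prefix
    # rstudio-stripprefix: drop the prefix before forwarding the request.
    if path.startswith(prefix):
        path = path[len(prefix):] or "/"
    return path, forwarded


print(apply_rstudio_middlewares("/rstudio/auth-sign-in", {}))
```

Requests arriving via http://rstudio.pecan.localhost use the separate rstudio-local router, which has no middlewares attached, so their paths are forwarded unchanged.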

8.8.9 docs

This service serves the documentation for the running version of PEcAn, as well as a homepage with links to all relevant endpoints. The homepage is available at http://pecan.localhost/, and the PEcAn documentation at http://pecan.localhost/docs/pecan/.

Our current configuration is as follows:

docs:
  hostname: docs
  image: pecan/docs:${PECAN_VERSION:-latest}
  platform: linux/amd64
  restart: unless-stopped
  networks:
  - pecan
  labels:
  - traefik.enable=true
  - traefik.http.services.docs.loadbalancer.server.port=80
  - traefik.http.routers.docs.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/`)
  healthcheck:
    test: curl --silent --fail http://localhost/ > /dev/null || exit 1
    interval: 10s
    timeout: 5s
    retries: 5

8.8.10 pecan (web)

This service runs the PEcAn web interface. It is accessible at http://pecan.localhost/pecan/.

Our configuration is as follows:

pecan:
  hostname: pecan-web
  user: ${UID:-1001}:${GID:-1001}
  image: pecan/web:${PECAN_VERSION:-latest}
  restart: unless-stopped
  networks:
  - pecan
  environment:
  - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
  - FQDN=${PECAN_FQDN:-docker}
  - NAME=${PECAN_NAME:-docker}
  - SECRET_KEY_BASE=${BETY_SECRET_KEY:-thisisnotasecret}
  depends_on:
    postgres:
      condition: service_healthy
    rabbitmq:
      condition: service_healthy
  labels:
  - traefik.enable=true
  - traefik.http.services.pecan.loadbalancer.server.port=8080
  - traefik.http.routers.pecan.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/pecan/`)
  volumes:
  - pecan:/data
  - pecan:/var/www/html/pecan/data
  healthcheck:
    test: curl --silent --fail http://localhost:8080/pecan > /dev/null || exit 1
    interval: 10s
    timeout: 5s
    retries: 5

Its Dockerfile ships with the PEcAn source code, in docker/web/Dockerfile.

In terms of actively developing PEcAn using Docker, this is the service to modify when making changes to the web interface (i.e. PHP, HTML, and JavaScript code located in the PEcAn web directory).

8.8.11 executor

This service is in charge of running the R code underlying the core PEcAn workflow. However, it is not in charge of executing the models themselves – model binaries are located on their own dedicated Docker containers, and model execution is coordinated by RabbitMQ.

Our configuration is as follows:

executor:
  hostname: executor
  user: ${UID:-1001}:${GID:-1001}
  image: pecan/executor:${PECAN_VERSION:-latest}
  platform: linux/amd64
  restart: unless-stopped
  networks:
  - pecan
  environment:
  - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
  - RABBITMQ_PREFIX=/
  - RABBITMQ_PORT=15672
  - FQDN=${PECAN_FQDN:-docker}
  depends_on:
    postgres:
      condition: service_healthy
    rabbitmq:
      condition: service_healthy
  volumes:
  - pecan:/data

Its Dockerfile ships with the PEcAn source code, in docker/executor/Dockerfile. Its image is built on top of the pecan/base image (docker/base/Dockerfile), which contains the actual PEcAn source. To facilitate caching, the pecan/base image is itself built on top of the pecan/depends image (docker/depends/Dockerfile), a large image that contains an R installation and PEcAn’s many system and R package dependencies (which usually take ~30 minutes or longer to install from scratch).

In terms of actively developing PEcAn using Docker, this is the service to modify when making changes to the PEcAn R source code. Note that, unlike changes to the web image’s PHP code, changes to the R source code do not immediately propagate to the PEcAn container; instead, you have to re-compile the code by running make inside the container.

8.8.12 monitor

This service shows all models that are currently running, at http://pecan.localhost/monitor/. The list is returned as JSON and shows all models (grouped by type and version) that are currently running or were seen in the past. It also contains a list of all currently active containers, as well as how many jobs are waiting to be processed.

This service is also responsible for registering any new models with PEcAn so that users can select and execute them from the web interface.

Our current configuration is as follows:

monitor:
  hostname: monitor
  user: ${UID:-1001}:${GID:-1001}
  image: pecan/monitor:${PECAN_VERSION:-latest}
  platform: linux/amd64
  restart: unless-stopped
  networks:
  - pecan
  environment:
  - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
  - FQDN=${PECAN_FQDN:-docker}
  depends_on:
    postgres:
      condition: service_healthy
    rabbitmq:
      condition: service_healthy
  labels:
  - traefik.enable=true
  - traefik.http.routers.monitor.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) &&
    PathPrefix(`/monitor/`)
  - traefik.http.routers.monitor.middlewares=monitor-stripprefix
  - traefik.http.middlewares.monitor-stripprefix.stripprefix.prefixes=/monitor
  volumes:
  - pecan:/data
  healthcheck:
    test: curl --silent --fail http://localhost:9999 > /dev/null || exit 1
    interval: 10s
    timeout: 5s
    retries: 5

8.8.13 api

This service provides the PEcAn RESTful API, which allows programmatic access to PEcAn workflows, models, and data. The API is accessible at http://pecan.localhost/api/. API documentation and a health check endpoint are available at /api/ping.

By default, authentication is disabled (AUTH_REQ=FALSE). For production deployments, set AUTH_REQ=TRUE in the .env file to require authentication.
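
As a quick check that the API is reachable through Traefik, the ping endpoint can be queried from the host. The sketch below uses only the Python standard library. The helper names ping_url and api_is_up are hypothetical, and the example assumes the default TRAEFIK_HOST of pecan.localhost, HTTP on port 80, and authentication disabled.

```python
import urllib.error
import urllib.request


def ping_url(host="pecan.localhost"):
    """Build the URL of the health check endpoint for a given host."""
    return f"http://{host}/api/ping"


def api_is_up(host="pecan.localhost", timeout=5):
    """Return True if the ping endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(ping_url(host), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False
```

Calling api_is_up() with the stack running should return True once the api container reports healthy. This is the same endpoint that the container healthcheck in the configuration below queries on port 8000.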

Our current configuration is as follows:

api:
  hostname: api
  platform: linux/amd64
  user: ${UID:-1001}:${GID:-1001}
  image: pecan/api:${PECAN_VERSION:-latest}
  restart: unless-stopped
  networks:
  - pecan
  environment:
  - PGHOST=${PGHOST:-postgres}
  - HOST_ONLY=${HOST_ONLY:-FALSE}
  - AUTH_REQ=${AUTH_REQ:-FALSE}
  - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
  - DATA_DIR=${DATA_DIR:-/data/}
  - DBFILES_DIR=${DBFILES_DIR:-/data/dbfiles/}
  - SECRET_KEY_BASE=${BETY_SECRET_KEY:-thisisnotasecret}
  labels:
  - traefik.enable=true
  - traefik.http.routers.api.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/api/`)
  - traefik.http.services.api.loadbalancer.server.port=8000
  depends_on:
    postgres:
      condition: service_healthy
  volumes:
  - pecan:/data/
  healthcheck:
    test: curl --silent --fail http://localhost:8000/api/ping > /dev/null || exit
      1
    interval: 10s
    timeout: 5s
    retries: 5

For more details on using the PEcAn API, see the PEcAn API documentation.

8.8.14 Model-specific containers

Additional models are added as additional services. The following models ship with PEcAn by default:

  • SIPNET (sipnet) – pecan/model-sipnet-git
  • ED2 (ed2) – pecan/model-ed2-2.2.0
  • FATES (fates) – ghcr.io/noresmhub/ctsm-api
  • BASGRA (basgra) – pecan/model-basgra-basgra_n_v1
  • MAESPA (maespa) – pecan/model-maespa-git
  • BioCro (biocro) – pecan/model-biocro-0.95

In general, their configuration should be similar to the following configuration for SIPNET:

sipnet:
  hostname: sipnet-git
  user: ${UID:-1001}:${GID:-1001}
  image: pecan/model-sipnet-git:${PECAN_VERSION:-latest}
  platform: linux/amd64
  restart: unless-stopped
  networks:
  - pecan
  environment:
  - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
  depends_on:
    rabbitmq:
      condition: service_healthy
  volumes:
  - pecan:/data

The PEcAn source contains Dockerfiles for ED2 (models/ed/Dockerfile) and SIPNET (models/sipnet/Dockerfile) that can serve as references. For additional tips on constructing a Dockerfile for your model, see Dockerfiles for Models.
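
Because the model services differ only in their hostname and image, the shared structure can be captured in a small template. The following sketch builds the service definition for a new model as a Python dictionary mirroring the SIPNET configuration above. The function model_service, the model name mymodel-1_0 and the image example/model-mymodel-1.0 are placeholders invented for this example, and the platform key is omitted because not all model services set it.

```python
import json


def model_service(hostname, image):
    """Return a service definition following the SIPNET template."""
    return {
        "hostname": hostname,
        "user": "${UID:-1001}:${GID:-1001}",
        "image": image,
        "restart": "unless-stopped",
        "networks": ["pecan"],
        "environment": [
            "RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}"
        ],
        # Wait for RabbitMQ to pass its healthcheck before starting.
        "depends_on": {"rabbitmq": {"condition": "service_healthy"}},
        "volumes": ["pecan:/data"],
    }


# Hypothetical model name and image, for illustration only.
service = model_service("mymodel-1_0", "example/model-mymodel-1.0:latest")
print(json.dumps({"mymodel": service}, indent=2))
```

The printed structure corresponds to what would be added under services, for example in a docker-compose.override.yml file.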