25.4 PEcAn Docker Architecture
- Overview
- PEcAn’s docker-compose
- Top-level structure
- traefik
- portainer
- minio
- thredds
- postgres
- rabbitmq
- bety
- docs
- web
- executor
- monitor
- Model-specific containers
25.4.1 Overview
The PEcAn Docker architecture consists of many containers (see figure below) that communicate with each other. The goal of this architecture is to make the PEcAn system easy to expand: new model containers are deployed and registered with PEcAn, after which users can use the new models in their work. The PEcAn framework sets up the configurations for the models and sends a message to the model containers to start execution. Once the execution is finished, the PEcAn framework continues. This is exactly as if the model were running on an HPC machine. Models can be executed in parallel by launching multiple model containers.
As can be seen in the figure, the architecture leverages two standard containers (in orange). The first container is PostgreSQL with PostGIS (mdillon/postgis), which stores the database used by both BETY and PEcAn. The second container is a message bus, specifically RabbitMQ (rabbitmq).
The BETY app container (pecan/bety) is the front end to the BETY database and is connected to the postgresql container. An HTTP server can be put in front of this container for SSL termination as well as for load balancing (by using multiple BETY app containers).
The PEcAn framework containers provide several distinct ways to interact with the PEcAn system (none of these containers has any models installed):
- PEcAn shiny hosts the Shiny applications that have been developed and interacts with the database to get all the information it needs to display.
- PEcAn rstudio is an RStudio environment with the PEcAn libraries preloaded. This allows for prototyping of new algorithms that can be used as part of the PEcAn framework later.
- PEcAn web allows the user to create a new PEcAn workflow. The workflow is stored in the database, and the models are executed by the model containers.
- PEcAn cli allows the user to supply a pecan.xml file that is executed by the PEcAn framework. The workflow created from the XML file is stored in the database, and the models are executed by the model containers.
The model containers contain the actual models that are executed, as well as small wrappers that make them work in the PEcAn framework. The containers run the model based on the parameters received from the message bus and convert the outputs back to the standard PEcAn output format. Once a container has finished processing a message, it immediately gets the next message and starts processing it.
25.4.2 PEcAn’s docker-compose
The PEcAn Docker architecture is described in full by the PEcAn docker-compose.yml file.
For full docker-compose syntax, see the official documentation.
This section describes the top-level structure and each of the services, which are as follows:
- traefik
- portainer
- minio
- thredds
- postgres
- rabbitmq
- bety
- docs
- web
- executor
- monitor
- Model-specific services
For reference, the complete docker-compose file is as follows:
services:
  traefik:
    hostname: traefik
    image: traefik:v2.9
    command:
      - --log.level=INFO
      - --api=true
      - --api.dashboard=true
      - --entrypoints.web.address=:80
      - --providers.docker=true
      - --providers.docker.endpoint=unix:///var/run/docker.sock
      - --providers.docker.exposedbydefault=false
      - --providers.docker.watch=true
    restart: unless-stopped
    networks:
      - pecan
    security_opt:
      - no-new-privileges:true
    ports:
      - "${TRAEFIK_HTTP_PORT-80}:80"
    volumes:
      - traefik:/config
      - /var/run/docker.sock:/var/run/docker.sock:ro
    labels:
      - traefik.enable=true
      - traefik.http.routers.traefik.entrypoints=web
      - traefik.http.routers.traefik.rule=Host(`traefik.pecan.localhost`)
      - traefik.http.routers.traefik.service=api@internal
  rabbitmq:
    hostname: rabbitmq
    image: rabbitmq:3.8-management
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - RABBITMQ_DEFAULT_USER=${RABBITMQ_DEFAULT_USER:-guest}
      - RABBITMQ_DEFAULT_PASS=${RABBITMQ_DEFAULT_PASS:-guest}
    labels:
      - traefik.enable=true
      - traefik.http.services.rabbitmq.loadbalancer.server.port=15672
      - traefik.http.routers.rabbitmq.entrypoints=web
      - traefik.http.routers.rabbitmq.rule=Host(`rabbitmq.pecan.localhost`)
    volumes:
      - rabbitmq:/var/lib/rabbitmq
    healthcheck:
      test: rabbitmqctl ping
      interval: 10s
      timeout: 5s
      retries: 5
  postgres:
    hostname: postgres
    image: mdillon/postgis:9.5
    restart: unless-stopped
    networks:
      - pecan
    volumes:
      - postgres:/var/lib/postgresql/data
    healthcheck:
      test: pg_isready -U postgres
      interval: 10s
      timeout: 5s
      retries: 5
  bety:
    hostname: bety
    image: pecan/bety:${BETY_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - UNICORN_WORKER_PROCESSES=1
      - SECRET_KEY_BASE=${BETY_SECRET_KEY:-notasecret}
      - RAILS_RELATIVE_URL_ROOT=/bety
      - LOCAL_SERVER=${BETY_LOCAL_SERVER:-99}
    volumes:
      - bety:/home/bety/log
    depends_on:
      postgres:
        condition: service_healthy
    labels:
      - traefik.enable=true
      - traefik.http.services.bety.loadbalancer.server.port=8000
      - traefik.http.routers.bety.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/bety/`)
    healthcheck:
      test: curl --silent --fail http://localhost:8000/$${RAILS_RELATIVE_URL_ROOT} > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  rstudio:
    hostname: rstudio
    image: pecan/base:${PECAN_VERSION:-latest}
    command: /work/rstudio.sh
    restart: unless-stopped
    networks:
      - pecan
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    environment:
      - KEEP_ENV=RABBITMQ_URI RABBITMQ_PREFIX RABBITMQ_PORT FQDN NAME
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
      - RABBITMQ_PREFIX=/
      - RABBITMQ_PORT=15672
      - FQDN=${PECAN_FQDN:-docker}
      - NAME=${PECAN_NAME:-docker}
      - USER=${PECAN_RSTUDIO_USER:-carya}
      - PASSWORD=${PECAN_RSTUDIO_PASS:-illinois}
      - USERID=${UID:-1001}
      - GROUPID=${GID:-1001}
    volumes:
      - pecan:/data
      - rstudio:/home
    labels:
      - traefik.enable=true
      - traefik.http.routers.rstudio.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/rstudio/`)
      - traefik.http.routers.rstudio.service=rstudio
      - traefik.http.routers.rstudio.middlewares=rstudio-stripprefix,rstudio-headers
      - traefik.http.services.rstudio.loadbalancer.server.port=8787
      - traefik.http.middlewares.rstudio-headers.headers.customrequestheaders.X-RStudio-Root-Path=/rstudio
      - traefik.http.middlewares.rstudio-stripprefix.stripprefix.prefixes=/rstudio
      - traefik.http.routers.rstudio-local.entrypoints=web
      - traefik.http.routers.rstudio-local.rule=Host(`rstudio.pecan.localhost`)
      - traefik.http.routers.rstudio-local.service=rstudio-local
      - traefik.http.services.rstudio-local.loadbalancer.server.port=8787
  docs:
    hostname: docs
    image: pecan/docs:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    labels:
      - traefik.enable=true
      - traefik.http.services.docs.loadbalancer.server.port=80
      - traefik.http.routers.docs.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/`)
    healthcheck:
      test: curl --silent --fail http://localhost/ > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  pecan:
    hostname: pecan-web
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/web:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
      - FQDN=${PECAN_FQDN:-docker}
      - NAME=${PECAN_NAME:-docker}
      - SECRET_KEY_BASE=${BETY_SECRET_KEY:-thisisnotasecret}
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    labels:
      - traefik.enable=true
      - traefik.http.services.pecan.loadbalancer.server.port=8080
      - traefik.http.routers.pecan.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/pecan/`)
    volumes:
      - pecan:/data
      - pecan:/var/www/html/pecan/data
    healthcheck:
      test: curl --silent --fail http://localhost:8080/pecan > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  monitor:
    hostname: monitor
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/monitor:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
      - FQDN=${PECAN_FQDN:-docker}
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    labels:
      - traefik.enable=true
      - traefik.http.routers.monitor.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/monitor/`)
      - traefik.http.routers.monitor.middlewares=monitor-stripprefix
      - traefik.http.middlewares.monitor-stripprefix.stripprefix.prefixes=/monitor
    volumes:
      - pecan:/data
    healthcheck:
      test: curl --silent --fail http://localhost:9999 > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  executor:
    hostname: executor
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/executor:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
      - RABBITMQ_PREFIX=/
      - RABBITMQ_PORT=15672
      - FQDN=${PECAN_FQDN:-docker}
    depends_on:
      postgres:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy
    volumes:
      - pecan:/data
  fates:
    hostname: fates
    user: ${UID:-1001}:${GID:-1001}
    image: ghcr.io/noresmhub/ctsm-api:latest
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
      - pecan:/data
  basgra:
    hostname: basgra
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-basgra-basgra_n_v1.0:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
      - pecan:/data
  sipnet:
    hostname: sipnet-git
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-sipnet-git:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
      - pecan:/data
  ed2:
    hostname: ed2-2_2_0
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-ed2-2.2.0:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
      - pecan:/data
  maespa:
    hostname: maespa-git
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-maespa-git:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
      - pecan:/data
  biocro:
    hostname: biocro-0_95
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-biocro-0.95:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
      - pecan:/data
  dbsync:
    hostname: dbsync
    image: pecan/shiny-dbsync:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    depends_on:
      postgres:
        condition: service_healthy
    labels:
      - traefik.enable=true
      - traefik.http.routers.dbsync.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/dbsync/`)
      - traefik.http.routers.dbsync.middlewares=dbsync-stripprefix
      - traefik.http.middlewares.dbsync-stripprefix.stripprefix.prefixes=/monitor
    healthcheck:
      test: curl --silent --fail http://localhost:3838 > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
  api:
    hostname: api
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/api:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan
    environment:
      - PGHOST=${PGHOST:-postgres}
      - HOST_ONLY=${HOST_ONLY:-FALSE}
      - AUTH_REQ=${AUTH_REQ:-FALSE}
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
      - DATA_DIR=${DATA_DIR:-/data/}
      - DBFILES_DIR=${DBFILES_DIR:-/data/dbfiles/}
      - SECRET_KEY_BASE=${BETY_SECRET_KEY:-thisisnotasecret}
    labels:
      - traefik.enable=true
      - traefik.http.routers.api.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/api/`)
      - traefik.http.services.api.loadbalancer.server.port=8000
    depends_on:
      postgres:
        condition: service_healthy
    volumes:
      - pecan:/data/
    healthcheck:
      test: curl --silent --fail http://localhost:8000/api/ping > /dev/null || exit 1
      interval: 10s
      timeout: 5s
      retries: 5
networks:
  pecan: ~
volumes:
  traefik: ~
  postgres: ~
  bety: ~
  rabbitmq: ~
  pecan: ~
  rstudio: ~
There are two ways to override values in the docker-compose.yml file. The first is to create a file called .env in the same folder as the docker-compose.yml file. This file sets the variables that docker-compose substitutes into docker-compose.yml (the ${VAR:-default} expressions), overriding their defaults. The following is an example of such a file:
# This file will override the configuration options in the docker-compose
# file. Copy this file to the same folder as docker-compose as .env
# ----------------------------------------------------------------------
# GENERAL CONFIGURATION
# ----------------------------------------------------------------------
# project name (-p flag for docker-compose)
#COMPOSE_PROJECT_NAME=pecan
# ----------------------------------------------------------------------
# TRAEFIK CONFIGURATION
# ----------------------------------------------------------------------
# hostname of server
#TRAEFIK_HOST=pecan-docker.ncsa.illinois.edu
# Run traefik on port 80 (http) and port 443 (https)
#TRAEFIK_HTTP_PORT=80
#TRAEFIK_HTTPS_PORT=443
# Use your real email address here to be notified if cert expires
#TRAEFIK_ACME_EMAIL=pecanproj@gmail.com
# ----------------------------------------------------------------------
# PEcAn CONFIGURATION
# ----------------------------------------------------------------------
# what version of pecan to use
#PECAN_VERSION=develop
# the fully qualified hostname used for this server
#PECAN_FQDN=pecan-docker.ncsa.illinois.edu
# short name shown in the menu
#PECAN_NAME=pecan-docker
# ----------------------------------------------------------------------
# BETY CONFIGURATION
# ----------------------------------------------------------------------
# what version of BETY to use
#BETY_VERSION=develop
# what is our server number, 99=vm, 98=docker
#BETY_LOCAL_SERVER=98
# secret used to encrypt cookies in BETY
#BETY_SECRET_KEY=1208q7493e8wfhdsohfo9ewhrfiouaho908ruq30oiewfdjspadosuf08q345uwrasdy98t7q243
# ----------------------------------------------------------------------
# MINIO CONFIGURATION
# ----------------------------------------------------------------------
# minio username and password
#MINIO_ACCESS_KEY=carya
#MINIO_SECRET_KEY=illinois
# ----------------------------------------------------------------------
# PORTAINER CONFIGURATION
# ----------------------------------------------------------------------
# password for portainer admin account
# use docker run --rm httpd:2.4-alpine htpasswd -nbB admin <password> | cut -d ":" -f 2
#PORTAINER_PASSWORD=$2y$05$5meDPBtS3NNxyGhBpYceVOxmFhiiC3uY5KEy2m0YRbWghhBr2EVn2
# ----------------------------------------------------------------------
# RABBITMQ CONFIGURATION
# ----------------------------------------------------------------------
# RabbitMQ username and password
#RABBITMQ_DEFAULT_USER=carya
#RABBITMQ_DEFAULT_PASS=illinois
# create the correct URI with above username and password
#RABBITMQ_URI=amqp://carya:illinois@rabbitmq/%2F
# ----------------------------------------------------------------------
# RSTUDIO CONFIGURATION
# ----------------------------------------------------------------------
# Default RStudio username and password for startup of container
#PECAN_RSTUDIO_USER=carya
#PECAN_RSTUDIO_PASS=illinois
You can also extend the docker-compose.yml file with a docker-compose.override.yml file (in the same directory), which lets you add more services or, for example, change where the volumes are stored (see official documentation). For example, the following will change the volume for postgres to be stored in your home directory:
version: "3"
volumes:
  postgres:
    driver_opts:
      type: none
      device: ${HOME}/postgres
      o: bind
25.4.3 Top-level structure
The root of the docker-compose.yml file contains three sections:
- services – This is a list of services provided by the application, with each service corresponding to a container. When containers communicate with each other internally, their hostnames correspond to their names in this section. For instance, regardless of the “project” name passed to docker-compose up, the hostname for connecting to the PostgreSQL database from any given container is always postgres (e.g. you should be able to access the PostgreSQL database by calling the following from inside a container: psql -d bety -U bety -h postgres). The services comprising the PEcAn application are described below.
- networks – This is a list of networks used by the application. Containers can only communicate with each other (via ports and hostnames) if they are on the same Docker network, and containers on different networks can only communicate through ports exposed by the host machine. We just provide the network name (pecan) and rely on Docker’s default network configuration. Note that the services we want connected to this network include a networks: ... - pecan tag. For more details on Docker networks, see the official documentation.
- volumes – Similarly to networks, this just contains a list of volume names we want. Briefly, in Docker, volumes are directories containing files that are meant to be shared across containers. Each volume corresponds to a directory, which can be mounted at a specific location by different containers. For example, syntax like volumes: ... - pecan:/data in a service definition means to mount the pecan “volume” (including its contents) in the /data directory of that container. Volumes also allow data to persist between container restarts; normally, any data created by a container during its execution is lost when the container is re-launched. For example, using a volume for the database allows data to be saved between different runs of the database container. Without volumes, we would start with a blank database every time we restart the containers. For more details on Docker volumes, see the official documentation. The compose file declares several volumes; the three most important are:
  - postgres – This contains the data files underlying the PEcAn PostgreSQL database (BETY). Notice that it is mounted by the postgres container to /var/lib/postgresql/data. This is the data that we pre-populate when we run the Docker commands to initialize the PEcAn database. Note that these are the values stored directly in the PostgreSQL database. The default files to which the database points (i.e. dbfiles) are stored in the pecan volume, described below.
  - rabbitmq – This volume contains persistent data for RabbitMQ. It is only used by the rabbitmq service.
  - pecan – This volume contains PEcAn’s dbfiles, which include downloaded and converted model inputs, processed configuration files, and outputs. It is used by almost all of the services in the PEcAn stack, and is typically mounted to /data.
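To make the relationship between the three sections concrete, here is a minimal sketch of a compose file that uses all of them. It is not part of the PEcAn configuration: the service name example and its ubuntu image are hypothetical placeholders, and only the pecan network and pecan volume names are taken from the file above.

```yaml
# Minimal sketch of the three top-level sections (not the full PEcAn file).
# The "example" service and its image are hypothetical placeholders.
services:
  example:
    image: ubuntu
    networks:
      - pecan          # join the shared network so other services are reachable by name
    volumes:
      - pecan:/data    # mount the "pecan" volume at /data inside the container

networks:
  pecan: ~             # use Docker's default network configuration

volumes:
  pecan: ~             # named volume with default settings
```

A service that omits the networks entry would not be attached to the pecan network, and one that omits the volumes entry would not see the shared /data directory.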
25.4.4 traefik
Traefik manages communication among the different PEcAn services and between PEcAn and the web.
Among other things, traefik facilitates the setup of web access to each PEcAn service via common and easy-to-remember URLs.
For instance, Traefik labels on the web service (named pecan in the complete docker-compose file above) configure access to the PEcAn web interface via the URL http://pecan.localhost/pecan/.
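For reference, the fragment below reproduces the Traefik-related labels of the web interface service as they appear in the complete docker-compose file above, where the service is named pecan. The indentation and comments are added here for readability; everything else is copied from that file.

```yaml
services:
  pecan:
    labels:
      # Opt this container in; the traefik service runs with exposedbydefault=false.
      - traefik.enable=true
      # Port inside the container that Traefik forwards requests to.
      - traefik.http.services.pecan.loadbalancer.server.port=8080
      # Match requests for the configured host whose path starts with /pecan/.
      - traefik.http.routers.pecan.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/pecan/`)
```

With TRAEFIK_HOST left unset, the rule matches requests to pecan.localhost whose path begins with /pecan/ and forwards them to port 8080 of the container.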
(Further details in the works…)
The traefik service configuration looks like this:
traefik:
  hostname: traefik
  image: traefik:v2.9
  command:
    - --log.level=INFO
    - --api=true
    - --api.dashboard=true
    - --entrypoints.web.address=:80
    - --providers.docker=true
    - --providers.docker.endpoint=unix:///var/run/docker.sock
    - --providers.docker.exposedbydefault=false
    - --providers.docker.watch=true
  restart: unless-stopped
  networks:
    - pecan
  security_opt:
    - no-new-privileges:true
  ports:
    - "${TRAEFIK_HTTP_PORT-80}:80"
  volumes:
    - traefik:/config
    - /var/run/docker.sock:/var/run/docker.sock:ro
  labels:
    - traefik.enable=true
    - traefik.http.routers.traefik.entrypoints=web
    - traefik.http.routers.traefik.rule=Host(`traefik.pecan.localhost`)
    - traefik.http.routers.traefik.service=api@internal
25.4.5 portainer
Portainer is a lightweight management UI that allows you to manage the Docker host (or swarm). You can use this service to monitor the different containers, see the log files, and start and stop containers.
The docker-compose.yml file shown above does not currently define a portainer service, so no configuration is shown here.
Portainer is accessible by browsing to pecan.localhost/portainer/. You can either set the password in the .env file (for an example see env.example) or use a web browser to go to the Portainer URL. If this is the first visit, it will ask you for a password.
25.4.6 minio
Minio is a service that provides access to a folder on disk through a variety of protocols, including S3 buckets and web-based access. We mainly use Minio to facilitate access to PEcAn data using a web browser without the need for CLI tools.
The docker-compose.yml file shown above does not currently define a minio service, so no configuration is shown here.
The Minio interface is accessible by browsing to minio.pecan.localhost.
From there, you can browse directories and download files.
You can also upload files by clicking the red “+” in the bottom-right corner.
Note that it is currently impossible to create or upload directories using the Minio interface (except in the /data root directory – those folders are called “buckets” in Minio).
Therefore, the recommended way to perform any file management tasks other than individual file uploads is through the command line, e.g.
docker run -it --rm --volume pecan_pecan:/data --volume /path/to/local/directory:/localdir ubuntu
# Now, you can move files between `/data` and `/localdir`, create new directories, etc.
25.4.7 thredds
This service allows PEcAn model outputs to be accessible via the THREDDS data server (TDS). When the PEcAn stack is running, the catalog can be explored in a web browser at http://pecan.localhost/thredds/catalog.html. Specific output files can also be accessed from the command line via commands like the following:
Note that everything after outputs/ exactly matches the directory structure of the workflows directory.
Which files are served, which subsetting services are available, and other aspects of the data server’s behavior are configured in the docker/thredds_catalog.xml file.
Specifically, this XML tells the data server to use the datasetScan tool to serve all files within the /data/workflows directory, with the additional filter that only files ending in .nc are served.
For additional information about the syntax of this file, see the extensive THREDDS documentation.
The docker-compose.yml file shown above does not currently define a thredds service, so no configuration is shown here.
25.4.8 postgres
This service provides a working PostGIS database. Our configuration is fairly straightforward:
postgres:
  hostname: postgres
  image: mdillon/postgis:9.5
  restart: unless-stopped
  networks:
    - pecan
  volumes:
    - postgres:/var/lib/postgresql/data
  healthcheck:
    test: pg_isready -U postgres
    interval: 10s
    timeout: 5s
    retries: 5
Some additional details about our configuration:
- image – This pulls a container with PostgreSQL + PostGIS pre-installed. Note that by default, we use PostgreSQL version 9.5. To experiment with other versions, you can change 9.5 accordingly.
- networks – This allows PostgreSQL to communicate with other containers on the pecan network. As mentioned above, the hostname of this service is just its name, i.e. postgres, so to connect to the database from inside a running container, use a command like the following: psql -d bety -U bety -h postgres
- volumes – Note that the PostgreSQL data files (which store the values in the SQL database) are stored on a volume called postgres (which is not the same as the postgres service, even though they share the same name).
25.4.9 rabbitmq
RabbitMQ is a message broker service. In PEcAn, RabbitMQ functions as a task manager and scheduler, coordinating the execution of different tasks (such as running models and analyzing results) associated with the PEcAn workflow.
Our configuration is as follows:
rabbitmq:
  hostname: rabbitmq
  image: rabbitmq:3.8-management
  restart: unless-stopped
  networks:
    - pecan
  environment:
    - RABBITMQ_DEFAULT_USER=${RABBITMQ_DEFAULT_USER:-guest}
    - RABBITMQ_DEFAULT_PASS=${RABBITMQ_DEFAULT_PASS:-guest}
  labels:
    - traefik.enable=true
    - traefik.http.services.rabbitmq.loadbalancer.server.port=15672
    - traefik.http.routers.rabbitmq.entrypoints=web
    - traefik.http.routers.rabbitmq.rule=Host(`rabbitmq.pecan.localhost`)
  volumes:
    - rabbitmq:/var/lib/rabbitmq
  healthcheck:
    test: rabbitmqctl ping
    interval: 10s
    timeout: 5s
    retries: 5
Note that the traefik.http.routers.rabbitmq.rule label indicates that browsing to http://rabbitmq.pecan.localhost/ leads to the RabbitMQ management console.
By default, the RabbitMQ management console has username/password guest/guest, which is highly insecure.
For production instances of PEcAn, we highly recommend changing these credentials to something more secure, and removing access to the RabbitMQ management console via Traefik.
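One possible way to remove Traefik access to the management console is sketched below. It assumes you use a docker-compose.override.yml file as described in the docker-compose section, and it has not been tested as a production configuration. Compose merges labels by name, so the value given here takes precedence over traefik.enable=true in the base file. Because the traefik service runs with --providers.docker.exposedbydefault=false, a container whose traefik.enable label is not true is not routed.

```yaml
# docker-compose.override.yml (sketch; merged on top of docker-compose.yml)
services:
  rabbitmq:
    labels:
      # Replaces the base file's traefik.enable=true for this service only,
      # so Traefik no longer routes rabbitmq.pecan.localhost to the console.
      - traefik.enable=false
```

The credentials themselves can be changed through the RABBITMQ_DEFAULT_USER, RABBITMQ_DEFAULT_PASS and RABBITMQ_URI variables in the .env file shown earlier. Other services still reach RabbitMQ over the internal pecan network regardless of this label.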
25.4.10 bety
This service operates the BETY web interface, which is effectively a web-based front-end to the PostgreSQL database.
Unlike the postgres service, which contains all the data needed to run PEcAn models, this service is not essential to the PEcAn workflow.
However, note that certain features of the PEcAn web interface do link to the BETY web interface and will not work if this container is not running.
Our configuration is as follows:
bety:
  hostname: bety
  image: pecan/bety:${BETY_VERSION:-latest}
  restart: unless-stopped
  networks:
    - pecan
  environment:
    - UNICORN_WORKER_PROCESSES=1
    - SECRET_KEY_BASE=${BETY_SECRET_KEY:-notasecret}
    - RAILS_RELATIVE_URL_ROOT=/bety
    - LOCAL_SERVER=${BETY_LOCAL_SERVER:-99}
  volumes:
    - bety:/home/bety/log
  depends_on:
    postgres:
      condition: service_healthy
  labels:
    - traefik.enable=true
    - traefik.http.services.bety.loadbalancer.server.port=8000
    - traefik.http.routers.bety.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/bety/`)
  healthcheck:
    test: curl --silent --fail http://localhost:8000/$${RAILS_RELATIVE_URL_ROOT} > /dev/null || exit 1
    interval: 10s
    timeout: 5s
    retries: 5
The BETY container Dockerfile is located in the root directory of the BETY GitHub repository (direct link).
25.4.11 docs
This service will show the documentation for the version of PEcAn running as well as a homepage with links to all relevant endpoints. You can access this at http://pecan.localhost/. You can find the documentation for PEcAn at http://pecan.localhost/docs/pecan/.
Our current configuration is as follows:
docs:
  hostname: docs
  image: pecan/docs:${PECAN_VERSION:-latest}
  restart: unless-stopped
  networks:
    - pecan
  labels:
    - traefik.enable=true
    - traefik.http.services.docs.loadbalancer.server.port=80
    - traefik.http.routers.docs.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/`)
  healthcheck:
    test: curl --silent --fail http://localhost/ > /dev/null || exit 1
    interval: 10s
    timeout: 5s
    retries: 5
25.4.12 web
This service runs the PEcAn web interface. It is effectively a thin wrapper around a standard Apache web server container from Docker Hub that installs some additional dependencies and copies over the necessary files from the PEcAn source code.
In the docker-compose.yml file shown above, this service is defined under the name pecan (using the pecan/web image); refer to that entry for its configuration.
Its Dockerfile ships with the PEcAn source code, in docker/web/Dockerfile.
In terms of actively developing PEcAn using Docker, this is the service to modify when making changes to the web interface (i.e. PHP, HTML, and JavaScript code located in the PEcAn web directory).
25.4.13 executor
This service is in charge of running the R code underlying the core PEcAn workflow. However, it is not in charge of executing the models themselves – model binaries are located on their own dedicated Docker containers, and model execution is coordinated by RabbitMQ.
Our configuration is as follows:
executor:
  hostname: executor
  user: ${UID:-1001}:${GID:-1001}
  image: pecan/executor:${PECAN_VERSION:-latest}
  restart: unless-stopped
  networks:
    - pecan
  environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    - RABBITMQ_PREFIX=/
    - RABBITMQ_PORT=15672
    - FQDN=${PECAN_FQDN:-docker}
  depends_on:
    postgres:
      condition: service_healthy
    rabbitmq:
      condition: service_healthy
  volumes:
    - pecan:/data
Its Dockerfile ships with the PEcAn source code, in docker/executor/Dockerfile.
Its image is built on top of the pecan/base image (docker/base/Dockerfile), which contains the actual PEcAn source.
To facilitate caching, the pecan/base image is itself built on top of the pecan/depends image (docker/depends/Dockerfile), a large image that contains an R installation and PEcAn’s many system and R package dependencies (which usually take ~30 minutes or longer to install from scratch).
In terms of actively developing PEcAn using Docker, this is the service to modify when making changes to the PEcAn R source code.
Note that, unlike changes to the web image’s PHP code, changes to the R source code do not immediately propagate to the PEcAn container; instead, you have to re-compile the code by running make inside the container.
25.4.14 monitor
This service shows all models that are currently running, at http://pecan.localhost/monitor/. The returned list is JSON and shows all models (grouped by type and version) that are currently running or were seen in the past. It also contains a list of all currently active containers, as well as how many jobs are waiting to be processed.
This service is also responsible for registering any new models with PEcAn so that users can select them and execute them from the web interface.
Our current configuration is as follows:
monitor:
  hostname: monitor
  user: ${UID:-1001}:${GID:-1001}
  image: pecan/monitor:${PECAN_VERSION:-latest}
  restart: unless-stopped
  networks:
    - pecan
  environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    - FQDN=${PECAN_FQDN:-docker}
  depends_on:
    postgres:
      condition: service_healthy
    rabbitmq:
      condition: service_healthy
  labels:
    - traefik.enable=true
    - traefik.http.routers.monitor.rule=Host(`${TRAEFIK_HOST:-pecan.localhost}`) && PathPrefix(`/monitor/`)
    - traefik.http.routers.monitor.middlewares=monitor-stripprefix
    - traefik.http.middlewares.monitor-stripprefix.stripprefix.prefixes=/monitor
  volumes:
    - pecan:/data
  healthcheck:
    test: curl --silent --fail http://localhost:9999 > /dev/null || exit 1
    interval: 10s
    timeout: 5s
    retries: 5
25.4.15 Model-specific containers
Additional models are added as additional services. In general, their configuration should be similar to the following configuration for SIPNET, which ships with PEcAn:
sipnet:
  hostname: sipnet-git
  user: ${UID:-1001}:${GID:-1001}
  image: pecan/model-sipnet-git:${PECAN_VERSION:-latest}
  restart: unless-stopped
  networks:
    - pecan
  environment:
    - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
  depends_on:
    rabbitmq:
      condition: service_healthy
  volumes:
    - pecan:/data
The PEcAn source contains Dockerfiles for ED2 (models/ed/Dockerfile) and SIPNET (models/sipnet/Dockerfile) that can serve as references.
For additional tips on constructing a Dockerfile for your model, see Dockerfiles for Models.
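As a sketch of how a new model service might be added without editing the main file, the following docker-compose.override.yml fragment mirrors the SIPNET entry. The service name mymodel, the hostname mymodel-1_0, and the image pecan/model-mymodel-1.0 are hypothetical placeholders for illustration only; substitute the names of the image you actually built.

```yaml
# docker-compose.override.yml (sketch; all "mymodel" names are hypothetical)
services:
  mymodel:
    hostname: mymodel-1_0
    user: ${UID:-1001}:${GID:-1001}
    image: pecan/model-mymodel-1.0:${PECAN_VERSION:-latest}
    restart: unless-stopped
    networks:
      - pecan            # same network as rabbitmq, so the broker is reachable by name
    environment:
      # The model container receives its jobs from the message bus at this URI.
      - RABBITMQ_URI=${RABBITMQ_URI:-amqp://guest:guest@rabbitmq/%2F}
    depends_on:
      rabbitmq:
        condition: service_healthy
    volumes:
      - pecan:/data      # shared volume holding inputs, configurations and outputs
```

Only the service name, hostname and image differ from the SIPNET entry; the network, message bus URI and /data volume are what connect the container to the rest of the stack.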