GSoC - PEcAn Project Ideas

PEcAn is an open-source ecosystem modeling framework integrating data, models, and uncertainty quantification. Below is a list of potential ideas where contributors can help improve and expand PEcAn. Come find us on Slack to discuss. If you have questions or would like to propose your own idea, contact @kooper in Slack or join our #gsoc-2025 channel.


Project Ideas

Below is a list of project ideas. Feel free to contact the listed mentors on Slack to discuss further, or contact @kooper with new ideas and he can help connect you with mentors.


Global sensitivity analysis / uncertainty partitioning

This project would extend PEcAn's existing uncertainty partitioning routines, which are primarily one-at-a-time and focused on model parameters, to also consider ensemble-based uncertainties in other model inputs (meteorology, soils, vegetation, phenology, etc.). The project would employ Sobol' methods; some uncommitted code exists that manually prototypes how this would be done in PEcAn. The goal would be to refactor or reimplement this prototype into a reliable, automated system and apply it to some key test cases in both natural and managed ecosystems.
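
To make the idea of Sobol' partitioning concrete, here is a minimal, hedged sketch of a first-order Sobol' index estimate using a pick-freeze (Saltelli-style) estimator. It is not the PEcAn prototype (which is in R); `toy_model`, `first_order_sobol`, and the three inputs are hypothetical stand-ins chosen only for illustration, and the inputs are assumed to be independent and uniform on [0, 1].

```python
import random


def toy_model(x):
    """Hypothetical stand-in for one model run.

    x[0], x[1], x[2] could represent, say, a parameter draw, a meteorology
    ensemble member, and a soil input, each rescaled to [0, 1].
    """
    return 4.0 * x[0] + 2.0 * x[1] + 0.5 * x[2]


def first_order_sobol(model, n_inputs, n_samples=20000, seed=1):
    """Estimate first-order Sobol' indices for independent U(0, 1) inputs."""
    rng = random.Random(seed)
    # Two independent sample matrices, A and B (rows are input vectors).
    a_rows = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    b_rows = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    y_a = [model(row) for row in a_rows]
    y_b = [model(row) for row in b_rows]

    # Total output variance, estimated from all 2N base runs.
    all_y = y_a + y_b
    mean = sum(all_y) / len(all_y)
    variance = sum((y - mean) ** 2 for y in all_y) / (len(all_y) - 1)

    indices = []
    for i in range(n_inputs):
        # "Pick-freeze": copy A, but take input i from B.
        y_ab = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(a_rows, b_rows)]
        # Centering the outputs leaves the expectation unchanged
        # and reduces the noise of the estimator.
        v_i = sum(
            (yb - mean) * (yab - ya) for yb, yab, ya in zip(y_b, y_ab, y_a)
        ) / n_samples
        indices.append(v_i / variance)
    return indices


sobol = first_order_sobol(toy_model, n_inputs=3)
for name, s in zip(["input 0", "input 1", "input 2"], sobol):
    print(f"{name}: first-order index ~ {s:.2f}")
```

For this linear toy model the exact indices are 16/20.25, 4/20.25, and 0.25/20.25, so the estimates should sum to roughly one. In a real application each call to `model` would be a full model run, which is why the number of samples and the choice of estimator matter.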

Expected outcomes:

A successful project would complete a subset of the following tasks:

  • Reliable, automated Sobol' sensitivity analysis and uncertainty partitioning across multiple model inputs.
  • Applications to test case(s) in natural and / or managed ecosystems.

Prerequisites:

  • Required: R (existing workflow and prototype is in R)
  • Helpful: familiarity with sensitivity analyses

Contact person:

Mike @Dietze

Duration:

Flexible to work as either a Medium (175hr) or Large (350hr) project

Difficulty:

Medium


Parallelization of Model Runs on HPC

This project would extend PEcAn's existing run mechanisms to be able to run on a High Performance Computing (HPC) cluster using Apptainer. For uncertainty analysis, PEcAn will run the same model thousands of times with small perturbations, which is a natural fit for an HPC system. The goal is not to submit thousands of jobs, but to have a single multi-node job that runs all of the ensemble members efficiently. Runs can be orchestrated using RabbitMQ, but other methods are also encouraged. The end goal is for the PEcAn system to be launched once and to run the full workflow on the HPC cluster from start to finish, leveraging as many nodes as it is given at submission.

Expected outcomes:

A successful project would complete a subset of the following tasks:

  • Show different ways to launch the jobs (RabbitMQ, lock files, simple round robin, etc.)
  • A report on the different options and how they can be enabled
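
Two of the launch strategies listed above can be sketched in a few lines. The following is a minimal illustration, not PEcAn code: `round_robin_assign` and `claim_run` are hypothetical helper names, and the run IDs are made up. The lock-file variant relies on exclusive file creation; whether that is safe on a given cluster's shared filesystem is an assumption that would need to be verified.

```python
import os
import tempfile


def round_robin_assign(run_ids, n_workers):
    """Static scheduling: deal run IDs out to workers like cards."""
    assignments = [[] for _ in range(n_workers)]
    for position, run_id in enumerate(run_ids):
        assignments[position % n_workers].append(run_id)
    return assignments


def claim_run(lock_dir, run_id):
    """Dynamic scheduling: claim a run by creating its lock file.

    O_CREAT | O_EXCL makes creation fail if the file already exists,
    so only one worker can succeed for a given run ID.
    """
    path = os.path.join(lock_dir, f"{run_id}.lock")
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another worker already claimed this run
    os.close(fd)
    return True


# Ten hypothetical ensemble members split across three workers.
plan = round_robin_assign(list(range(10)), n_workers=3)
print(plan)

# Two workers racing for the same run: only the first claim succeeds.
with tempfile.TemporaryDirectory() as lock_dir:
    first_claim = claim_run(lock_dir, "ens-0001")
    second_claim = claim_run(lock_dir, "ens-0001")
print(first_claim, second_claim)
```

Round robin needs no coordination but can leave nodes idle when run times vary; lock files (or a message queue such as RabbitMQ) let fast workers pick up remaining runs at the cost of shared state.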

Prerequisites:

  • Required: R (existing workflow and prototype is in R), Docker
  • Helpful: Familiarity with HPC and Apptainer

Contact person:

Rob @Kooper

Duration:

Flexible to work as either a Medium (175hr) or Large (350hr) project

Difficulty:

Medium


Database and Data Improvements

PEcAn relies on the BETYdb database to store trait and yield data as well as model provenance information. This project aims to separate trait data from provenance tracking and to ensure that PEcAn is able to run without the Postgres server currently required to run BETYdb. The goal is to make the workflows easier to use and the data more accessible.

Potential Directions

  • Minimal BETYdb Database: Create a simplified version of BETYdb for demonstrations and integration tests.
  • Non-Database Setup: Enable workflows that do not require PostgreSQL or a web front-end.

Expected outcomes:

A successful project would complete a subset of the following tasks:

  • A lightweight, distributable demo Postgres database.
  • A workflow that does not depend on a Postgres database, enabling easier local testing and deployment.

Contact person:

Chris Black (@infotroph)

Duration:

Suitable for a Medium (175hr) or Large (350hr) project.

Difficulty: Medium, Large


Development of Notebook-based PEcAn Workflows

The PEcAn workflow is currently run using either a web-based user interface, an API, or custom R scripts. The web-based user interface is the easiest to use but has limited functionality, whereas the custom R scripts and API are more flexible but require more experience.

This project will focus on building Quarto workflows aimed at providing an interface to PEcAn that is both welcoming to new users and flexible enough to be a starting point for more advanced users. It will build on existing Pull Request 1733.

Expected outcome:

  • Two or more template workflows for running the PEcAn workflow, plus a written vignette and video tutorial introducing their use.

Prerequisites:

  • Familiarity with R. Familiarity with RStudio and Quarto or R Markdown is a plus.

Contact person: David LeBauer @dlebauer, Nihar Sanda @koolgax99

Duration: Medium (175hr)

Difficulty: Medium

Refactoring Compile-time Flags to Runtime Flags in SIPNET

Project Overview

The ecosystem model SIPNET is a core component of many PEcAn analyses. SIPNET is compiled with multiple compile-time flags that control whether different features are turned on or off. Thus, as currently configured, each model structure requires a separately compiled binary.

This project will refactor these flags to be runtime-configurable via command-line arguments or a configuration file, improving usability and testing efficiency.

Expected Outcomes

  • Convert selected SIPNET compile-time flags to runtime options.
  • Develop a global configuration object for managing runtime flags.
  • Improve testability by enabling different configurations without recompiling.
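
The pattern behind these outcomes can be sketched independently of the implementation language. The example below uses Python for brevity (SIPNET itself is written in C) and shows one possible precedence order: built-in defaults, then a configuration file, then command-line arguments. Everything here is an assumption for illustration: `RuntimeFlags`, the flag names `snow` and `litter_pool`, the `key = value` file syntax, and the helper functions are hypothetical and do not correspond to actual SIPNET flags or options.

```python
import argparse
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class RuntimeFlags:
    """Single configuration object; field names are hypothetical examples."""
    snow: bool = True
    litter_pool: bool = False


def parse_bool(text):
    """Accept a few common spellings of on/off."""
    value = text.strip().lower()
    if value in ("1", "true", "on", "yes"):
        return True
    if value in ("0", "false", "off", "no"):
        return False
    raise ValueError(f"not a boolean: {text!r}")


def flags_from_config_text(config_text, base=RuntimeFlags()):
    """Apply 'key = value' lines (with optional # comments) on top of base."""
    known = {f.name for f in fields(RuntimeFlags)}
    updates = {}
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key not in known:
            raise KeyError(f"unknown flag: {key}")
        updates[key] = parse_bool(value)
    return replace(base, **updates)


def flags_from_cli(argv, base):
    """Apply command-line overrides such as '--snow off' on top of base."""
    parser = argparse.ArgumentParser()
    for f in fields(RuntimeFlags):
        option = "--" + f.name.replace("_", "-")
        parser.add_argument(option, dest=f.name, type=parse_bool, default=None)
    args = parser.parse_args(argv)
    updates = {k: v for k, v in vars(args).items() if v is not None}
    return replace(base, **updates)


# Defaults -> config file -> command line.
from_file = flags_from_config_text("snow = off  # disable\nlitter_pool = on\n")
final = flags_from_cli(["--snow", "on"], from_file)
print(final)
```

With a single binary and a configuration object like this, a test suite could loop over flag combinations at run time instead of rebuilding the model for each one.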

Prerequisites

  • Required: C, experience with compilers and build systems.
  • Helpful: Understanding of simulation models.

Mentor(s)

  • David LeBauer (@dlebauer)
  • Mike Longfritz

Duration

  • Medium (175hr) or Large (350hr)

Difficulty

  • Medium to Large