GSoC - PEcAn Project Ideas

Ecosystem science has many components, and so does PEcAn! Components where you can contribute include, but are not limited to:

  • Meta-Data Upload Interface – For trait ‘metadata’ tables, enable users to upload R data tables and CSV files to the sites, citations, variables, methods, treatments, and experiments tables in BETY, the PostgreSQL database.
  • Data Ingestion Shiny App – Improve upon our existing data ingestion user interface. Implement a machine learning algorithm to intelligently suggest variable matches to new data and extend the automated download to new
  • Extend API – We've developed the base functionality to use PEcAn as an API. We need your help to extend its functionality for production and distribution, as well as to solidify the documentation.
  • Admin Dashboard – Allow users of PEcAn to more easily set up new nodes and administer their PEcAn ecosystem through the web interface.
  • Visualization – Develop a Shiny app to visualize outputs and analyses of ensemble runs.
  • Containerization – Solidify PEcAn's container architecture.
  • Extend Analysis – Develop a Shiny app that lets users easily apply PEcAn analyses to existing workflows.
  • Add Remote Data – Expand remote sensing and GIS tools in PEcAn.

    Meta-Data Upload Interface

    For trait ‘metadata’ tables, enable users to upload R data tables and CSV files to the sites, citations, variables, methods, treatments, and experiments tables. There are three steps: (1) adding API endpoints, (2) writing R functions that use them, and (3) creating a Shiny application.

    Expected outcome: Add ‘POST’ endpoints to the API for metadata tables (sites, citations, variables, methods, treatments, experiments, ...). Write functions that take an appropriately formatted data table, Google Sheet, or CSV and insert it into BETYdb. Create a Shiny application that walks users through the process of uploading these tables.
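
    As a rough illustration of the "appropriately formatted" check that could run before a table is sent to an endpoint, here is a minimal sketch. It is written in Python only to keep the example language-neutral; the project itself calls for R. The table names come from the list above, but the required column names are hypothetical placeholders, not the actual BETYdb schema.

```python
import csv
import io

# Hypothetical required columns per metadata table. These names are
# illustrative assumptions; consult the real BETYdb schema before use.
REQUIRED_COLUMNS = {
    "sites": {"sitename", "lat", "lon"},
    "variables": {"name", "units"},
}


def validate_metadata_csv(table, csv_text):
    """Return (rows, problems) for CSV text destined for one metadata table."""
    if table not in REQUIRED_COLUMNS:
        return [], [f"unknown table: {table}"]
    required = REQUIRED_COLUMNS[table]
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    missing = sorted(required - header)
    if missing:
        return [], [f"missing column: {name}" for name in missing]
    rows, problems = [], []
    # Data rows start on line 2 because line 1 holds the header.
    for line_no, row in enumerate(reader, start=2):
        empty = sorted(k for k in required if not (row[k] or "").strip())
        if empty:
            problems.append(f"line {line_no}: empty value for {', '.join(empty)}")
        else:
            rows.append(row)
    return rows, problems
```

    Rows that pass could then be handed to whatever function posts them to the API, while the list of problems is shown back to the user in the upload interface.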

    Prerequisites: R is required, along with experience in or willingness to learn PostgreSQL programming and R Shiny.

    Contact persons: David LeBauer, @dlebauer, and Kristina Riemer, @Kristina Riemer

    Data Ingestion Shiny App

    Ecosystem modeling relies heavily on fusing data from multiple sources. Whether the data are used to calibrate a model or to benchmark a model result, they come from different sources that vary in their formats and naming conventions. The difference in semantics creates a bottleneck because no central ontology exists to translate and relate the measurements from different sites, experiments, and/or databases. To alleviate this issue, this project’s goal is to improve upon the existing Shiny data ingestion app. The two main tasks will be to refine the existing interface and then to add a machine learning component to ease the process of matching formats and variables as new data is added.

    Expected outcome: A Shiny app that facilitates easy data ingestion and learns to suggest existing variable matches in the database for the data being uploaded.
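
    To make the idea of suggesting variable matches concrete, the sketch below ranks known variable names by plain string similarity to an uploaded column name. This is only a simple baseline that a learned model might later replace, not the machine learning component itself. It is written in Python for illustration (the project itself calls for R), and the function names and the cutoff value are assumptions.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and drop separators so 'Air_Temp' and 'air temp' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def suggest_matches(new_name, known_names, limit=3, cutoff=0.5):
    """Rank known variable names by string similarity to an uploaded column name."""
    target = normalize(new_name)
    scored = []
    for candidate in known_names:
        score = SequenceMatcher(None, target, normalize(candidate)).ratio()
        if score >= cutoff:
            scored.append((score, candidate))
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidate for _, candidate in scored[:limit]]


# Made-up variable names standing in for entries in the database.
known = ["air_temperature", "soil_moisture", "precipitation_flux"]
suggestions = suggest_matches("AirTemp", known)
```

    A user's accepted or rejected suggestions could be logged and later used as training data, which is one way the app could "learn" over time.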

    Prerequisites: Proficiency in R. Interest in data provenance, SQL (PostgreSQL preferred), and Shiny.

    Contact person: Tony Gardella @tonygard

    Extend API

    Extend the PEcAn API package to full functionality. Flesh out support for file I/O and transfer components using PEcAn THREDDS, and redesign the functionality so that it does not rely on database connections.

    Expected outcome: Easy to use API package allowing users to

    Prerequisites: Knowledge of R

    Contact person: Make your interest known on Slack and we will match you with a mentor.

    Admin Dashboard

    PEcAn exists as a distributed set of machines, but setting up and maintaining them can take great expertise. To ease this process, this project entails building upon the existing dashboard so that it becomes easier to add a node to the network and modify machine settings.

    Expected outcome: An easy-to-use web interface that allows users to change config.php and the machine settings of PEcAn.

    Prerequisites: Experience with R and PHP

    Contact person: Rob Kooper, @kooper

    Scientific Visualization

    Our mission is to create an ecosystem modeling toolbox that is accessible to a non-technical audience (e.g., a high school ecology classroom) while retaining sufficient power and versatility to be valuable to scientific programmers (e.g. ecosystem model developers). However, the diversity of ecosystem models and associated analyses supported by PEcAn poses logistical challenges for presentation of results, especially given the wide range of targeted users. Web-based interactive visualizations can be a powerful tool for exploring model outputs and data as well as a fun learning tool in educational environments.

    Currently, PEcAn has basic support for interactive visualizations of outputs using R Shiny. We are looking for a student interested in addressing any of the following areas:

  • Improving Shiny application stability and performance, for instance through more efficient caching or lazy-loading of large outputs and data, or leveraging more efficient interactive visualization frameworks.
  • Enhancing the visual elements of our interface for starting model runs, including visualization of existing sites and input data and better UI elements for setting run options.
  • Developing novel interactive visualization tools that leverage more advanced statistical techniques, such as visualizing and applying machine learning algorithms to outputs and model-data residuals, exploring results in multivariate space.

    Expected outcome: A more robust set of web-based interactive visualization tools for model simulations and user-provided data.

    Prerequisites: Familiarity with R Shiny, including the ability to work with and debug these tools in a remote, Unix-based CLI environment, is required. Proficiency with SQL (especially PostgreSQL), HDF/NetCDF formats, and/or advanced statistics (e.g. multivariate regression, time series analysis, information theory) is preferred. Experience with other web-based interactive visualization frameworks, such as JavaScript’s D3, is a plus.

    Contact persons: Betsy Cowdery, @bcow, and Hamze Dokoohaki, @Hamze

    Containers

    There are a number of projects related to containers and PEcAn:

  • PEcAn and Kubernetes - Add the ability to start a PEcAn instance on a Google, Amazon, and/or local Kubernetes cluster.
  • Docker Models - Have only the model run in one container, with the job setup, pre-model run, and post-model run done in an additional container. This would help make the model containers smaller and not require any R code. It would also remove the need to add the model setup and job configuration to the default Docker image.
  • Secure Data - Securely fetch all the data from a central service, place the data next to the model, and, after execution finishes, push the data back to the central service. Ideally there would be some caching capability, where files are not deleted locally (assuming there is space); this would allow us to reuse the same met file without downloading it multiple times in an ensemble run.
  • Singularity - Use Singularity to run models and convert model Docker images to Singularity images. From a Docker web image, launch a qsub job on an HPC system to run the Singularity images that perform the actual model runs. Add the ability to leverage HTCondor to run the models in a Condor environment.

    Prerequisites: R, Docker, and Shiny; experience with git is a plus.

    Contact person: Rob Kooper, @kooper
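
    The caching idea in the Secure Data item above can be sketched as follows. This is a minimal illustration in Python (chosen only to keep the example language-neutral) and covers caching alone, not secure transfer. The `fetch` callable, the class name, and the identifiers used in the demo are hypothetical stand-ins for a real client of the central service.

```python
import hashlib
import tempfile
from pathlib import Path


class LocalFileCache:
    """Keep fetched files on disk, keyed by a hash of their remote identifier."""

    def __init__(self, cache_dir, fetch):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch  # hypothetical callable: remote_id -> bytes
        self.downloads = 0  # counts cache misses, for demonstration only

    def path_for(self, remote_id):
        digest = hashlib.sha256(remote_id.encode("utf-8")).hexdigest()
        return self.cache_dir / digest

    def get(self, remote_id):
        """Return a local path, downloading only on a cache miss."""
        path = self.path_for(remote_id)
        if not path.exists():
            data = self.fetch(remote_id)
            tmp = path.with_suffix(".part")
            tmp.write_bytes(data)
            tmp.replace(path)  # rename so readers never see a partial file
            self.downloads += 1
        return path


if __name__ == "__main__":
    # Fake fetcher standing in for the central data service.
    with tempfile.TemporaryDirectory() as cache_dir:
        cache = LocalFileCache(cache_dir, lambda rid: b"met data for " + rid.encode())
        for _ in range(5):  # five ensemble members reuse one met file
            cache.get("met/site-1/2004.nc")
        print(cache.downloads)
```

    In this sketch, repeated requests for the same identifier within an ensemble run trigger a single download; eviction when disk space runs low is left out.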

    Extend Analysis

    PEcAn offers multiple analyses on top of a simple execution of an ecosystem model. Currently, you must write a custom script or start a run again from scratch if you would like to perform one of these analyses on an existing model run. To alleviate this problem, this project will entail creating a Shiny app that facilitates taking an existing model run and initiating analyses on it.

    Expected outcome: A Shiny app that walks a user through selecting an existing workflow and choosing from a set of analyses to apply to that workflow.

    Prerequisites: Experience with R is required; knowledge of Shiny is preferred.

    Contact person: Tony Gardella @tonygard

    Add Remote Data

    PEcAn offers the ability to ingest multiple streams of data automatically into models. We currently lack automated ingestion of remote sensing and large spatial data. This project will develop and improve upon PEcAn's ability to ingest these types of data.

    Expected outcome: A SHINY app and/or set of functions that automates the process of ingesting remote sensing data into the PEcAn workflow.

    Prerequisites: Experience with R is required.

    Contact persons: Shawn Serbin and Bailey Morrison, @Bailey BNL