24.4 Thredds Setup

Installing and configuring Thredds for PEcAn authors - Rob Kooper

NOTE: These instructions have only been tested on Ubuntu 16.04 on the VM. If you have instructions for CentOS/RedHat, please update this documentation.

NOTE: Pretty much every step here requires root access.

24.4.1 Install Tomcat 8 and the Thredds webapp

The Tomcat 8 server can be installed from the default Ubuntu repositories. The Thredds webapp is downloaded and installed from Unidata.

The first step is to install and configure Tomcat 8. The flag -Dtds.content.root.path should point to the directory in which the thredds folder is located (here /home/carya). This location must be writable by the Tomcat user. The flag -Djava.security.egd tells Java to use /dev/urandom as the random number source for Tomcat; the default can take too long to generate a random number.

apt-get -y install tomcat8 openjdk-8-jdk
echo JAVA_OPTS=\"-Dtds.content.root.path=/home/carya \${JAVA_OPTS}\" >> /etc/default/tomcat8
echo JAVA_OPTS=\"-Djava.security.egd=file:/dev/./urandom \${JAVA_OPTS}\" >> /etc/default/tomcat8
service tomcat8 restart
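
Because the echo commands above append with >>, running them a second time leaves duplicate JAVA_OPTS lines in /etc/default/tomcat8. The following is a minimal sketch of one way to avoid that; append_once is a hypothetical helper introduced here for illustration and is not part of PEcAn or Tomcat.

```shell
# append_once FILE LINE
# Hypothetical helper (an assumption, not part of PEcAn): append LINE to FILE
# only if that exact line is not already present in it.
append_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example usage (as root); the single quotes keep ${JAVA_OPTS} literal:
# append_once /etc/default/tomcat8 'JAVA_OPTS="-Dtds.content.root.path=/home/carya ${JAVA_OPTS}"'
# append_once /etc/default/tomcat8 'JAVA_OPTS="-Djava.security.egd=file:/dev/./urandom ${JAVA_OPTS}"'
# service tomcat8 restart
```

The single-quoted lines are the same text the echo commands write, so the resulting file should be identical after a first run.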

Next, install the webapp.

mkdir /home/carya/thredds
chmod 777 /home/carya/thredds

wget -O /var/lib/tomcat8/webapps/thredds.war ftp://ftp.unidata.ucar.edu/pub/thredds/4.6/current/thredds.war

Finally, configure Apache to proxy the Thredds server.

cat > /etc/apache2/conf-available/thredds.conf << EOF
ProxyPass        /thredds/ http://localhost:8080/thredds/
ProxyPassReverse /thredds/ http://localhost:8080/thredds/
RedirectMatch permanent ^/thredds$ /thredds/
EOF
a2enmod proxy_http
a2enconf thredds
service apache2 reload

24.4.1.1 Customize the Thredds server

To customize the Thredds server for your installation, edit the file /home/carya/thredds/threddsConfig.xml. For example, the following file is included in the VM.

<?xml version="1.0" encoding="UTF-8"?>
<threddsConfig>

  <!-- all options are commented out in standard install - meaning use default values -->
  <!-- see http://www.unidata.ucar.edu/software/thredds/current/tds/reference/ThreddsConfigXMLFile.html -->
  <serverInformation>
    <name>PEcAn</name>
    <logoUrl>/pecan/images/pecan_small.jpg</logoUrl>
    <logoAltText>PEcAn</logoAltText>

    <abstract>Scientific Data</abstract>
    <keywords>meteorology, atmosphere, climate, ocean, earth science</keywords>
    
    <contact>
      <name>Rob Kooper</name>
      <organization>NCSA</organization>
      <email>kooper@illinois.edu</email>
      <!--phone></phone-->
    </contact>
    <hostInstitution>
      <name>PEcAn</name>
      <webSite>http://www.pecanproject.org/</webSite>
      <logoUrl>/pecan/images/pecan_small.jpg</logoUrl>
      <logoAltText>PEcAn Project</logoAltText>
    </hostInstitution>
  </serverInformation>

  <!--
  The <catalogRoot> element:
  For catalogs you don't want visible from the /thredds/catalog.xml chain
  of catalogs, you can use catalogRoot elements. Each catalog root config
  catalog is crawled and used in configuring the TDS.

  <catalogRoot>myExtraCatalog.xml</catalogRoot>
  <catalogRoot>myOtherExtraCatalog.xml</catalogRoot>
  -->

  <!--
   * Setup for generated HTML pages.
   *
   * NOTE: URLs may be absolute or relative, relative URLs must be relative
   * to the webapp URL, i.e., http://server:port/thredds/.
    -->
  <htmlSetup>
    <!--
     * CSS documents used in generated HTML pages.
     * The CSS document given in the "catalogCssUrl" element is used for all pages
     * that are HTML catalog views. The CSS document given in the "standardCssUrl"
     * element is used in all other generated HTML pages.
     * -->
    <standardCssUrl>tds.css</standardCssUrl>
    <catalogCssUrl>tdsCat.css</catalogCssUrl>
    <openDapCssUrl>tdsDap.css</openDapCssUrl>

    <!--
     * The Google Analytics Tracking code you would like to use for the
     * webpages associated with THREDDS. This will not track WMS or DAP
     * requests for data, only browsing the catalog.
    -->
    <googleTrackingCode></googleTrackingCode>

  </htmlSetup>
  
  <!-- 
    The <TdsUpdateConfig> element controls if and how the TDS checks
    for updates. The default is for the TDS to check for the current
    stable and development release versions, and to log that information
    in the TDS serverStartup.log file as INFO entries.

  <TdsUpdateConfig>
     <logVersionInfo>true</logVersionInfo>
  </TdsUpdateConfig>
  -->
   
  <!--
   The <CORS> element controls Cross-Origin Resource Sharing (CORS).
   CORS is a way to allow a website (such as THREDDS) to open up access
   to resources to web pages and applications running on a different domain.
   One example would be allowing a web-application to use fonts from
   a separate host. For TDS, this can allow a javascript app running on a
   different site to access data on a THREDDS server.
   For more information see: https://en.wikipedia.org/wiki/Cross-origin_resource_sharing
   The elements below represent defaults. Only the <enabled> tag is required
   to enable CORS. The default allowed origin is '*', which allows sharing
   to any domain.
  <CORS>
    <enabled>false</enabled>
    <maxAge>1728000</maxAge>
    <allowedMethods>GET</allowedMethods>
    <allowedHeaders>Authorization</allowedHeaders>
    <allowedOrigin>*</allowedOrigin>
  </CORS>
  -->

  <!--
   The <CatalogServices> element:
   - Services on local TDS served catalogs are always on.
   - Services on remote catalogs are set with the allowRemote element
   below. They are off by default (recommended).
   -->
  <CatalogServices>
    <allowRemote>false</allowRemote>
  </CatalogServices>

  <!--
  Configuring the CDM (netcdf-java library)
  see http://www.unidata.ucar.edu/software/netcdf-java/reference/RuntimeLoading.html

  <nj22Config>
    <ioServiceProvider class="edu.univ.ny.stuff.FooFiles"/>
    <coordSysBuilder convention="foo" class="test.Foo"/>
    <coordTransBuilder name="atmos_ln_sigma_coordinates" type="vertical" class="my.stuff.atmosSigmaLog"/>
    <typedDatasetFactory datatype="Point" class="gov.noaa.obscure.file.Flabulate"/>
  </nj22Config>
  -->

  <!--
  CDM uses the DiskCache directory to store temporary files, like uncompressed files.
  <DiskCache>
    <alwaysUse>false</alwaysUse>
    <scour>1 hour</scour>
    <maxSize>1 Gb</maxSize>
  </DiskCache>
  -->

  <!--
  Caching open NetcdfFile objects.
  default is to allow 50 - 100 open files, cleanup every 11 minutes
  <NetcdfFileCache>
    <minFiles>50</minFiles>
    <maxFiles>100</maxFiles>
    <scour>11 min</scour>
  </NetcdfFileCache>
  -->

  <!--
  The <HTTPFileCache> element:
  allow 10 - 20 open datasets, cleanup every 17 minutes
  used by HTTP Range requests.
  <HTTPFileCache>
    <minFiles>10</minFiles>
    <maxFiles>20</maxFiles>
    <scour>17 min</scour>
  </HTTPFileCache>
  -->

  <!--
  Writing GRIB indexes.
  <GribIndexing>
    <setExtendIndex>false</setExtendIndex>
    <alwaysUseCache>false</alwaysUseCache>
  </GribIndexing>
  -->

  <!--
  Persist joinNew aggregations to named directory. scour every 24 hours, delete stuff older than 90 days
  <AggregationCache>
    <scour>24 hours</scour>
    <maxAge>90 days</maxAge>
    <cachePathPolicy>NestedDirectory</cachePathPolicy>
  </AggregationCache>
  -->

  <!--
  How to choose the template dataset for an aggregation. latest, random, or penultimate
  <Aggregation>
    <typicalDataset>penultimate</typicalDataset>
  </Aggregation>
  -->

  <!--
  The Netcdf Subset Service is off by default.
  <NetcdfSubsetService>
    <allow>false</allow>
    <scour>10 min</scour>
    <maxAge>-1 min</maxAge>
  </NetcdfSubsetService>
  -->

  <!--
  <Opendap>
    <ascLimit>50</ascLimit>
    <binLimit>500</binLimit>
    <serverVersion>opendap/3.7</serverVersion>
  </Opendap>
    -->
  
  <!--
  The WCS Service is off by default.
  Also, off by default (and encouraged) is operating on a remote dataset.
  <WCS>
    <allow>false</allow>
    <allowRemote>false</allowRemote>
    <scour>15 min</scour>
    <maxAge>30 min</maxAge>
  </WCS>
  -->

  <!--
  <WMS>
    <allow>false</allow>
    <allowRemote>false</allowRemote>
    <maxImageWidth>2048</maxImageWidth>
    <maxImageHeight>2048</maxImageHeight>
  </WMS>
  -->

  <!--
  <NCISO>
    <ncmlAllow>false</ncmlAllow>
    <uddcAllow>false</uddcAllow>
    <isoAllow>false</isoAllow>
  </NCISO>
  -->

  <!-- CatalogGen service is off by default.
  <CatalogGen>
    <allow>false</allow>
  </CatalogGen>
   -->

  <!-- DLwriter service is off by default.
       As is support for operating on remote catalogs.
  <DLwriter>
    <allow>false</allow>
    <allowRemote>false</allowRemote>
  </DLwriter>
   -->

  <!-- DqcService is off by default.
  <DqcService>
    <allow>false</allow>
  </DqcService>
   -->

  <!--
   Link to a Viewer application on the HTML page:
   <Viewer>my.package.MyViewer</Viewer>
   -->

   <!--
   Add a DataSource - essentially an IOSP with access to Servlet request parameters
   <datasetSource>my.package.DatsetSourceImpl</datasetSource>
   -->

  <!--
   set FeatureCollection logging
  <FeatureCollection>
     <RollingFileAppender>
       <MaxFileSize>1 MB</MaxFileSize>
       <MaxBackups>5</MaxBackups>
       <Level>INFO</Level>
     </RollingFileAppender>
  </FeatureCollection>
  -->

  <!--
    Configure how the NetCDF-4 C library is discovered and used.
    libraryPath: The directory in which the native library is installed.
    libraryName: The name of the native library. This will be used to locate the proper .DLL, .SO, or .DYLIB file
      within the libraryPath directory.
    useForReading: By default, the native library is only used for writing NetCDF-4 files; a pure-Java layer is
      responsible for reading them. However, if this property is set to true, then it will be used for reading
      NetCDF-4 (and HDF5) files as well.
  -->
  <!--
  <Netcdf4Clibrary>
    <libraryPath>/usr/local/lib</libraryPath>
    <libraryName>netcdf</libraryName>
    <useForReading>false</useForReading>
  </Netcdf4Clibrary>
  -->
</threddsConfig>

24.4.2 Update the catalog

To keep the catalog up to date with the latest data, add the following entry to the root crontab; it runs once an hour, on the hour. This cron job also synchronizes the database with remote servers and dumps your database (by default in /home/carya/dump).

0 * * * * /home/carya/pecan/scripts/cron.sh -o /home/carya/dump
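
The entry can also be installed without opening an editor. The sketch below assumes a cron implementation whose crontab command accepts "-" to read the new table from standard input (as on Ubuntu); add_cron_entry is a hypothetical helper, not part of PEcAn.

```shell
# add_cron_entry ENTRY
# Hypothetical helper (an assumption, not part of PEcAn): read an existing
# crontab on stdin and print it with ENTRY appended, unless that exact line
# is already present.
add_cron_entry() {
    entry=$1
    existing=$(cat)
    # Re-emit the existing table unchanged (if there is one).
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -x: whole-line match, -F: fixed string, so the asterisks are literal.
    if ! printf '%s\n' "$existing" | grep -qxF -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Example usage (as root):
# crontab -l 2>/dev/null \
#   | add_cron_entry '0 * * * * /home/carya/pecan/scripts/cron.sh -o /home/carya/dump' \
#   | crontab -
```

Running the pipeline more than once should leave a single copy of the entry in the crontab.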

24.4.3 Troubleshooting

Refer to the log files for Tomcat (/var/log/tomcat8/*) and Thredds (/home/carya/thredds/logs).
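
As a starting point, the log directories can be scanned for error-level messages. This is a rough sketch: recent_errors is a hypothetical helper, and the assumption that problems are logged with the words SEVERE (Tomcat) or ERROR (Thredds) may not cover every failure.

```shell
# recent_errors DIR [COUNT]
# Hypothetical helper (an assumption, not part of PEcAn): print the last COUNT
# lines (default 20) containing ERROR or SEVERE from the files under DIR.
recent_errors() {
    dir=$1
    count=${2:-20}
    # -r: recurse into DIR, -E: extended regex, -I: skip binary files
    # such as compressed, rotated logs.
    grep -rEI 'ERROR|SEVERE' "$dir" 2>/dev/null | tail -n "$count"
}

# Example usage (as root):
# recent_errors /var/log/tomcat8
# recent_errors /home/carya/thredds/logs 50
```

Each matching line is prefixed with the name of the file it came from, which helps to tell Tomcat and Thredds messages apart.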

24.4.4 Further reading