Resources and guidelines for NetCDF formatting#
NetCDF conventions and the NPI approach#
It can be confusing to try to get an overview of what the various conventions and best practices for NetCDF formatting actually are.
Here, we collect various resources for where to find information about the conventions.
The most important sources of information for the NPI approach are:
The CF conventions (current version 1.10)
Describe conventions for use metadata, i.e. information about how to interpret the data.
This covers standardised descriptions of the physical meaning and units of data variables, as well as standard names for geographical regions etc.
The ACDD conventions (current version 1.3)
Describe conventions for discovery metadata, i.e. information which makes it easy to find the datasets.
This covers global attributes describing the dataset and its history, such as the attributes title, creator_name, and time_coverage_start.
The two are compatible: ACDD defines which attributes to include, and the CF conventions provide standardised names for specific important variable attributes.
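As a minimal sketch of how the two conventions meet in one file, the global attributes can be assembled as a plain dictionary before being attached to a dataset (for instance through the attrs mapping of an xarray Dataset). The helper name build_global_attrs and all values below are hypothetical placeholders, not NPI defaults.

```python
from datetime import datetime, timezone


def build_global_attrs(title, creator_name, start):
    """Return a dict of global attributes (hypothetical helper).

    `start` is assumed to be a timezone-aware datetime.
    """
    start_utc = start.astimezone(timezone.utc)
    return {
        # Declares which conventions the file claims to follow.
        "Conventions": "CF-1.10, ACDD-1.3",
        "title": title,
        "creator_name": creator_name,
        # ISO 8601 timestamp in UTC.
        "time_coverage_start": start_utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


attrs = build_global_attrs(
    title="Example mooring temperature record",  # placeholder value
    creator_name="Jane Doe",  # placeholder value
    start=datetime(2020, 9, 1, 12, 0, 0, tzinfo=timezone.utc),
)
print(attrs["time_coverage_start"])
```

Keeping the attributes in an ordinary dictionary makes it easy to review or validate them before they are written to a file.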
In addition, the following are useful resources:
The UNIDATA NetCDF User’s Guide (NUG) provides official documentation for the netCDF format.
NetCDF Conventions % Physical Oceanography Data management at NPI is a nice document written by Yannick Kern for work with netCDF files at NPI. Some of the content may be superseded by the ongoing efforts; we should consider making an updated version.
The OceanSITES Data Format Reference Manual details netCDF standards for OceanSITES datasets (focused on ocean transport mooring arrays). It provides guidelines that are compatible with the above conventions and specific to oceanographic data.
IMOS NETCDF CONVENTIONS (Australia) are also useful.
CF conventions#
The CF (Climate and Forecast) conventions define the community standards for NetCDF formatting. They are an extension of the older COARDS conventions. They were originally targeted at large gridded datasets in the climate and weather forecasting world, but have expanded to become the standard for general earth science data, including observational data such as the ocean data we collect at NPI.
The standard_name attribute associated with a variable is a key part of the CF conventions. It serves to describe precisely the physical quantities being represented. The standard_name is strictly controlled by the CF conventions. For example, in-situ ocean temperature should have the standard name sea_water_temperature.
The standard_name is a variable attribute, different from the variable name. The CF standard name table defines the standard_name and the canonical units.
Multiple variables can have the same standard_name, e.g. if we have different measurements of the same variable or dual sensors. They should be differentiated not by standard_name but by other variable attributes and by variable name.
The units attribute associated with a variable describes the physical unit of the variable data, such as Pa, J m-2, or degree_north.
Dimensionless quantities that represent fractions, or parts of a whole, should have unit 1. This applies, for example, to practical salinity and sea ice fraction.
It is fine to use units other than the canonical units, such as degree_Celsius instead of K. The requirement is that the string describing the unit is supported by the UDUNITS2 package. An overview of units and symbols is available in the UDUNITS2 documentation.
The long_name attribute associated with each variable is optional but recommended, and not standardised by the conventions. It is described as a long descriptive name which may, for example, be used for labeling plots. In instances where we do not use a standard_name, as in the case of uncalibrated/voltage data, it is very important to include a long_name describing the physical meaning of the variable.
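The attribute rules above can be sketched as plain Python dictionaries, one per variable. The variable set, the long_name strings, and the helper find_undescribed are illustrative assumptions; only the standard names, the units strings, and the rule that a variable without standard_name needs a long_name come from the text.

```python
# Hypothetical variable attributes, keyed by variable name.
variable_attrs = {
    "TEMP": {
        "standard_name": "sea_water_temperature",
        "units": "degree_Celsius",  # non-canonical but UDUNITS-compatible
        "long_name": "In-situ sea water temperature",
    },
    "PSAL": {
        "standard_name": "sea_water_practical_salinity",
        "units": "1",  # dimensionless quantity
        "long_name": "Practical salinity",
    },
    "CHLA_uncal": {
        # Uncalibrated voltage: no standard_name, so long_name is essential.
        "units": "V",
        "long_name": "Uncalibrated chlorophyll fluorometer voltage",
    },
}


def find_undescribed(attrs_by_var):
    """Return names of variables with neither standard_name nor long_name."""
    return [
        name
        for name, attrs in attrs_by_var.items()
        if "standard_name" not in attrs and "long_name" not in attrs
    ]


print(find_undescribed(variable_attrs))
```

A check of this kind only verifies that a description is present. Whether a units string is actually valid would still need to be confirmed against UDUNITS2.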
For example, if sea temperature on a mooring is measured by a series of 5 Microcats and by a profiler that produces values at 10 levels, it may be reported in a single file with 2 temperature variables and 2 depth variables. TEMP(TIME, DEPTH) could hold the Microcat data, if DEPTH is declared as a 5-element coordinate; and TEMP_prof(TIME, DEPTH_prof) could hold the profiler data if DEPTH_prof is declared as a 10-element coordinate. Both variables would have a standard_name of “sea_water_temperature”. The following lists a subset of the OceanSITES recommended variable names.
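The mooring example can be written out as a small data structure to show how two variables share one standard_name while differing in name and dimensions. This is only a sketch: the TIME length of 100 is an arbitrary placeholder, and the dictionary layout is an assumption chosen for illustration rather than a netCDF API.

```python
from collections import defaultdict

# Coordinate sizes: 5 Microcats and 10 profiler levels (from the example).
# The TIME length is an arbitrary placeholder.
dims = {"TIME": 100, "DEPTH": 5, "DEPTH_prof": 10}

variables = {
    "TEMP": {
        "dims": ("TIME", "DEPTH"),
        "standard_name": "sea_water_temperature",
    },
    "TEMP_prof": {
        "dims": ("TIME", "DEPTH_prof"),
        "standard_name": "sea_water_temperature",
    },
}


def shape_of(name):
    """Return the array shape implied by a variable's dimensions."""
    return tuple(dims[d] for d in variables[name]["dims"])


# Group variable names by standard_name to see which ones share a quantity.
by_standard_name = defaultdict(list)
for var_name, info in variables.items():
    by_standard_name[info["standard_name"]].append(var_name)

print(shape_of("TEMP"), shape_of("TEMP_prof"))
print(dict(by_standard_name))
```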
Variable names
The variable name itself is not standardised by CF. The CF-documentation explicitly states that “Nothing depends on the names of variables”.
There is one notable exception: names of standard coordinate variables (TIME, DEPTH, PRES, LONGITUDE, LATITUDE) should follow UNIDATA conventions when possible.
Recommendations for variable names (but not strict standardizations!) are given by the SeaDataNet Parameter Discovery Vocabulary P02 (SDN P02). These are also used in the OceanSITES conventions.
At NPI, we will try to adhere to SDN P02 names, e.g., PRES, TEMP, PSAL, CNDC. Useful references for this are the parameter names in the OceanSITES manual and the ARGO user’s manual (p75).
OceanSITES (3.6, p25) recommends that variable names start with SDN P02 names, which can be followed by a suffix (e.g. TEMP_prof). A suggested recommendation (ØL) is to use the suffix to indicate preliminary/uncalibrated data (for example, CHLA_uncal).
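The base-name-plus-suffix pattern can be sketched as a small parser. The set KNOWN_BASE_NAMES contains only the names mentioned in this document and is not the full SDN P02 vocabulary; the function names are hypothetical, and treating everything after the first underscore as the suffix is an assumption about how names are composed.

```python
# Base names mentioned in this document; the real vocabulary is much larger.
KNOWN_BASE_NAMES = {"PRES", "TEMP", "PSAL", "CNDC", "CHLA"}


def split_variable_name(name):
    """Split a name such as 'TEMP_prof' into (base, suffix).

    The suffix is an empty string when the name has no underscore.
    """
    base, _, suffix = name.partition("_")
    if base not in KNOWN_BASE_NAMES:
        raise ValueError(f"{name!r} does not start with a known base name")
    return base, suffix


def is_uncalibrated(name):
    """Return True if the name carries the suggested 'uncal' suffix."""
    return split_variable_name(name)[1] == "uncal"


print(split_variable_name("TEMP_prof"))
print(is_uncalibrated("CHLA_uncal"))
```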
ACDD conventions#
The ACDD (Attribute Convention for Data Discovery) conventions define the metadata attributes that should be included in the netCDF file. ACDD operates with Highly recommended, Recommended, and Suggested attributes. We should aim to always use the first two and include Suggested attributes whenever possible.
The ACDD website provides a nicely ordered list of all the attributes with an explanation of what each field should contain. Yannick’s document also provides a nice template.
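A simple completeness check by attribute level might look like the sketch below. The Highly recommended list follows ACDD 1.3 as far as we know; the Recommended list is deliberately a small subset for illustration and should be replaced by the full list from the ACDD documentation. The function name missing_by_level and the example attribute values are hypothetical.

```python
ACDD_LEVELS = {
    "highly_recommended": ["title", "summary", "keywords", "Conventions"],
    # Only a small subset of the Recommended attributes, for illustration.
    "recommended": ["creator_name", "time_coverage_start", "time_coverage_end"],
}


def missing_by_level(global_attrs):
    """Return, per level, the attributes that are absent or empty."""
    return {
        level: [name for name in names if not global_attrs.get(name)]
        for level, names in ACDD_LEVELS.items()
    }


example_attrs = {
    "title": "Example mooring temperature record",  # placeholder value
    "Conventions": "CF-1.10, ACDD-1.3",
    "creator_name": "Jane Doe",  # placeholder value
}
print(missing_by_level(example_attrs))
```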
ARGO guidelines#
The ARGO data management user’s manual details the practices at ARGO.
Other useful attributes not strictly required (“NPI conventions”)#
It is good practice to include an original_name attribute containing the variable name before conversion of the data. E.g., for CTD data, this could be the SBE variable names such as t090C, lECO-AFL etc.
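One way to keep track of the source names is to record them at the moment of renaming. The sketch below assumes that the SBE variable t090C is mapped to TEMP; that mapping and the helper rename_with_original are illustrative assumptions, not an established NPI table.

```python
# Assumed mapping from an SBE name to an NPI variable name (illustrative).
SBE_TO_NPI = {"t090C": "TEMP"}


def rename_with_original(attrs_by_var, mapping):
    """Rename variables and store the old name in 'original_name'.

    Variables that are not in the mapping are copied unchanged.
    The input dictionary is not modified.
    """
    renamed = {}
    for old_name, attrs in attrs_by_var.items():
        new_name = mapping.get(old_name, old_name)
        new_attrs = dict(attrs)  # copy so the input stays untouched
        if new_name != old_name:
            new_attrs["original_name"] = old_name
        renamed[new_name] = new_attrs
    return renamed


raw = {"t090C": {"units": "degree_Celsius"}}
converted = rename_with_original(raw, SBE_TO_NPI)
print(converted)
```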
Unidata conventions#
Conventions for units are defined by the UDUNITS package maintained at UCAR.
Other useful resources#
UNIDATA NetCDF User’s Guide
NetCDF Conventions % Physical Oceanography Data management at NPI: Nice documentation written by Yannick Kern