Quality indicators processing_level and QC_indicator#
These data quality descriptors are a common source of confusion. Below is an attempt to suggest a standard NPIOcean usage.
Caution
This contains some draft recommendations quickly put together by ØF. They are suggestions for a future NPIOcean recommendation, not agreed-upon NPIOcean guidelines (to be discussed further).
processing_level#
Description#
ACDD Recommended attribute, described as:
“A textual description of the processing (or quality control) level of the data.”
Can be assigned as either:
A global attribute (if one description fits the entire dataset), or
A variable attribute for each data variable (if processing/QC differs between variables)
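The choice between the two placements can be sketched in a few lines of Python. This is only an illustration, not an NPIOcean tool: the function name `assign_processing_level` and the variable names `PSAL` and `TEMP` are hypothetical, and plain dictionaries stand in for whatever attribute containers your NetCDF library exposes (for example `Dataset.attrs` in xarray).

```python
def assign_processing_level(global_attrs, var_attrs, levels):
    """Place processing_level globally or per variable.

    global_attrs: dict of global attributes
    var_attrs:    dict mapping variable name -> dict of variable attributes
    levels:       dict mapping variable name -> processing description
    """
    distinct = set(levels.values())
    if len(distinct) == 1:
        # One description fits the entire dataset: use a global attribute.
        global_attrs["processing_level"] = distinct.pop()
    else:
        # Processing/QC differs between variables: use variable attributes.
        for name, text in levels.items():
            var_attrs[name]["processing_level"] = text
    return global_attrs, var_attrs


# Hypothetical example: two variables with different processing histories.
global_attrs = {}
var_attrs = {"PSAL": {}, "TEMP": {}}
levels = {
    "PSAL": "Salinity data have been corrected against in-situ water samples.",
    "TEMP": "Converted to physical values only. No subsequent quality control applied.",
}
assign_processing_level(global_attrs, var_attrs, levels)
```

Here the two descriptions differ, so each variable gets its own `processing_level` and no global attribute is written.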
OceanSITES#
OceanSITES prescribes set values for processing_level (page 24, OceanSITES manual), such as "Raw instrument data", "Post-recovery calibrations have been applied", and "Data interpolated".
Full OceanSITES option list for processing_level
Raw instrument data
Instrument data that has been converted to geophysical values
Post-recovery calibrations have been applied
Data has been scaled using contextual information
Known bad data has been replaced with null values
Known bad data has been replaced with values based on surrounding data
Ranges applied, bad data flagged
Data interpolated
Data manually reviewed
Data verified against model or other contextual information
Other QC process applied
In practice, it can be hard to pick one of these that is a good fit for the processing/quality level of the data. In particular, the descriptions are too broad to be very useful, and in most cases more than one of the categories applies.
Since the conventions do not actually require the use of set values (in fact, "a textual description" rather suggests this should be written case by case), we may want to move away from using the OceanSITES values.
NPIOcean recommendation (tentative)#
Suggested recommendation for use of processing_level
Write, in your own words, a brief summary (about 1-3 sentences) of the processing level.
Examples:
Salinity data have been corrected against in-situ water samples. A small number of outliers removed after manual review.
Converted to physical values only. No subsequent quality control applied.
A scale factor 1.03 has been applied to oxygen based on climatological mean for the area. Missing values have been filled using linear interpolation.
Take the OceanSITES list as a guide if helpful, but do not feel obliged to use the exact categories.
Think of the user who quickly wants to understand what the data processing level is.
QC_indicator#
Description#
An OceanSITES-specific attribute (i.e. not required by conventions). A quick description (a few words) of the data quality (good data, probably good data, unknown, ...).
Because QC_indicator is not required by convention and its values do not come from a controlled dictionary, there is technically no need to adhere to a fixed set of strings. It is a useful attribute to include, however!
NPIOcean recommendation (tentative)#
Suggested recommendation for use of QC_indicator
Use QC_indicator when you can.
Think of it as a quick description of the data quality (e.g. good data).
Assign it on the global or variable level depending on what was done for processing_level, i.e. assign it as a global attribute if processing_level is a global attribute.
Preferably use values from the list below. (This list is based on the OceanSITES QC-indicator codes; we have added the category uncalibrated data.)
Suggested option list for QC_indicator
| Meaning | Comment |
|---|---|
| unknown | No QC was performed. |
| good data | All QC tests passed. |
| probably good data | |
| potentially correctable bad data | Not to be used without scientific correction or re-calibration. |
| bad data | Data have failed one or more tests. |
| nominal value | Data were not observed but reported (e.g., instrument target depth). |
| interpolated value | Missing data may be interpolated from neighboring data in space or time. |
| missing value | Fill value - not actual data (e.g. a placeholder for data that will arrive later). |
| uncalibrated data | Uncalibrated data (e.g. raw Chl-A data not calibrated against water samples). |
Think of the user who quickly wants to understand what the data quality is.
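As a rough sketch of how these suggestions could be checked before a file is published, the snippet below tests two things: that each QC_indicator value comes from the suggested list, and that QC_indicator is not set at a level where processing_level is missing. The function name `check_qc_indicator` is hypothetical, plain dictionaries stand in for the global and variable attributes, and the set of allowed strings is assumed here to be the OceanSITES QC flag meanings plus uncalibrated data.

```python
# Assumed value list: OceanSITES QC flag meanings plus "uncalibrated data".
SUGGESTED_QC_INDICATORS = {
    "unknown",
    "good data",
    "probably good data",
    "potentially correctable bad data",
    "bad data",
    "nominal value",
    "interpolated value",
    "missing value",
    "uncalibrated data",
}


def check_qc_indicator(global_attrs, var_attrs):
    """Return a list of readable warnings; an empty list means nothing was flagged."""
    problems = []
    # Check the global attributes first, then each variable's attributes.
    scopes = [("global", global_attrs)] + list(var_attrs.items())
    for where, attrs in scopes:
        if "QC_indicator" not in attrs:
            continue  # QC_indicator is optional, so absence is not flagged
        value = attrs["QC_indicator"]
        if value not in SUGGESTED_QC_INDICATORS:
            problems.append(f"{where}: QC_indicator {value!r} is not in the suggested list")
        if "processing_level" not in attrs:
            problems.append(f"{where}: QC_indicator is set here but processing_level is not")
    return problems


# Hypothetical example: a valid global pair and a questionable variable attribute.
global_attrs = {
    "processing_level": "Converted to physical values only. No subsequent quality control applied.",
    "QC_indicator": "unknown",
}
var_attrs = {"PSAL": {"QC_indicator": "excellent"}}
problems = check_qc_indicator(global_attrs, var_attrs)
```

The check only warns; since neither attribute is governed by a controlled vocabulary, whether to act on a warning remains a judgment call.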