Time-series data files for sensors collecting one-dimensional data are described here (i.e., every time-stamped data record refers to a value at a single point in space). This definition may include instruments on mobile platforms such as Wally (a crawler in Barkley Canyon) or the Vertical Profiling System (VPS); for more information, see the mobile device page. However, gridded data from instruments like multibeam sonar and echosounders are excluded. For data searches defined by location and data source (e.g. CTD at Folger Deep), new files are not started if the device contributing the data changes (i.e. if the CTD is swapped for another). For data searches defined by instrument type (e.g. CTD 10600), new files are started if the device is moved (e.g. from MTC to Folger Deep).
Four formats of Time Series Scalar Data are currently offered: CSV (comma-separated values), JSON (JavaScript Object Notation), ODV (Ocean Data View) and MAT (MATLAB). CSV, JSON and ODV are text formats readable by any text editor or other software capable of reading text. MAT files are readable by MathWorks MATLAB. A prototype NetCDF format exists but is not yet offered to all users. NetCDF is also available from our ERDDAP server. User requests for additional formats will be seriously considered (we're always happy to help); please contact us.
Oceans 3.0 API filter: dataProductCode=TSSD
See New Features Release Notes for more updates
Applies to JSON format files only.
This data is available in CSV, JSON, ODV and MAT formats, plus a prototype NetCDF format that is accessible only to internal users. Content descriptions and example files are provided below. A new file is started whenever an instrument has new coordinates (lat, lon, depth). See the mobile device page for how these data products handle data from mobile devices.
CSV-formatted data files can be opened with Microsoft Excel, Ocean Data View, R, SPSS, Minitab, Matlab, text editors or any other software capable of reading delimited text files. Each file is limited to a maximum of 1 million records.
Files are divided into sections for metadata and data. All non-data entries are preceded by the pound sign (#). Sub-sections are delimited by dashed lines. Each section and its contents are described here, and a sample can be found below.
Origin Section
Location Section
Device Section
Data Section
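Because every non-data entry starts with the pound sign, the metadata and the data can be separated with a few lines of code. The following is a minimal Python sketch, not an official reader: the function name `split_onc_csv` and the shortened sample text are hypothetical, and it assumes the data rows are plain comma-separated values as in the examples on this page.

```python
import csv
import io


def split_onc_csv(text):
    """Split a Time Series Scalar CSV into metadata lines and data rows.

    Lines starting with '#' are treated as non-data entries (metadata,
    dashed sub-section rules, headings); everything else is data.
    """
    metadata_lines = []
    data_lines = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        if stripped.startswith("#"):
            metadata_lines.append(stripped.lstrip("#").strip())
        else:
            data_lines.append(stripped)
    rows = list(csv.reader(io.StringIO("\n".join(data_lines))))
    return metadata_lines, rows


# Hypothetical, heavily shortened file content for illustration only.
sample = """#Origin Section
#---------------
#Data Section
20120206T000000.000Z,45.34,0.01,NaN,2543
20120206T000001.001Z,45.53,0.01,NaN,2542
"""

metadata, rows = split_onc_csv(sample)
print(len(metadata), "metadata lines,", len(rows), "data rows")
```

Values are returned as strings so that 'NaN' entries and timestamps can be converted however the calling code prefers.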
EXAMPLE FILE: CambridgeBay_UnderwaterNetwork_ConductivityTemperatureDepth_20180701T000000Z_20180701T005958Z-NaN_clean.csv
Oceans 3.0 API filter: extension=csv
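The API filters quoted on this page can be combined into a single query string. The sketch below only encodes the two filters named so far (`dataProductCode` and `extension`); the endpoint, access token, location and date parameters that a real request needs are not covered here and are deliberately left out.

```python
from urllib.parse import urlencode

# Filters quoted in this document; everything else a real request needs
# (endpoint, token, location, dates) is deliberately left out.
filters = {"dataProductCode": "TSSD", "extension": "csv"}
query = urlencode(filters)
print(query)
```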
When the 'Fill data gaps with NaNs (Not A Number)' option is selected, the time series scalar data products add lines of NaNs for missing samples or empty resample periods to fill gaps in the data, as described in the data gaps section.
For example, a regular Time Series Scalar CSV file for an instrument with four sensors and a one-second sampling period might contain the following (QAQC flags are excluded for this example):

    20120206T000000.000Z,45.34,0.01,NaN,2543
    20120206T000001.001Z,45.53,0.01,NaN,2542
    20120206T000006.045Z,46.01,0.01,NaN,2541

while the same data for a CSV with NaNs would contain the following:

    20120206T000000.000Z,45.34,0.01,NaN,2543
    20120206T000001.001Z,45.53,0.01,NaN,2542
    20120206T000002.001Z,NaN,NaN,NaN,NaN
    20120206T000003.001Z,NaN,NaN,NaN,NaN
    20120206T000004.001Z,NaN,NaN,NaN,NaN
    20120206T000005.001Z,NaN,NaN,NaN,NaN
    20120206T000006.045Z,46.01,0.01,NaN,2541

In the first example, there is a data gap of four sample periods between the second and third lines. This shows up in the CSV with NaNs as four lines of NaNs. In both cases, NaN is written in the fourth column because no data was found for that sensor at any of the timestamps.
This data product was created for users that are opening their CSV data within an application that requires the NaN-valued time stamps in order to graph the data properly.
The timestamps for the rows of NaN-values are calculated programmatically as follows:
If there is any clock drift on the instrument, there may be a noticeable jump between the last timestamp in the data gap and the one immediately following it, but the jump should never exceed two sample periods.
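The gap-filling behaviour shown in the example above can be approximated as follows. This is a sketch under stated assumptions, not the production algorithm: it assumes fill timestamps step forward from the last real sample in whole sample periods (which matches the example rows), and the function name `fill_gaps` is hypothetical.

```python
from datetime import datetime, timedelta

FMT = "%Y%m%dT%H%M%S.%fZ"


def fill_gaps(rows, sample_period_s):
    """Insert all-NaN rows wherever consecutive timestamps are too far apart.

    rows: list of (timestamp_string, [values...]) sorted by time.
    Assumption: fill timestamps step forward from the last real sample
    in whole sample periods.
    """
    period = timedelta(seconds=sample_period_s)
    filled = []
    for i, (stamp, values) in enumerate(rows):
        if i > 0:
            prev_time = datetime.strptime(rows[i - 1][0], FMT)
            this_time = datetime.strptime(stamp, FMT)
            # Number of whole sample periods that were skipped.
            missing = round((this_time - prev_time) / period) - 1
            for k in range(1, missing + 1):
                t = prev_time + k * period
                # Trim microseconds to milliseconds to match the file format.
                filled.append((t.strftime(FMT)[:-4] + "Z", ["NaN"] * len(values)))
        filled.append((stamp, values))
    return filled


rows = [
    ("20120206T000000.000Z", ["45.34", "0.01", "NaN", "2543"]),
    ("20120206T000001.001Z", ["45.53", "0.01", "NaN", "2542"]),
    ("20120206T000006.045Z", ["46.01", "0.01", "NaN", "2541"]),
]
for stamp, values in fill_gaps(rows, 1.0):
    print(",".join([stamp] + values))
```

With the three rows from the example, the sketch inserts four NaN rows between the second and third samples.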
JSON
JSON-formatted data files can be opened with many text editors or any other software capable of parsing JSON format. Each file is limited to a maximum of one hundred thousand records.
Files contain two main objects: metadata and sensorData (one object per sensor). metadata contains the same metadata as the CSV file header, whereas sensorData differs from the CSV data section by having additional fields such as sensorName, unitOfMeasure and actualSamples. See the CSV documentation above for more information on each field; the same definitions and Java code engine are used to generate both CSV and JSON. JSON files can be downloaded in two different formats, as noted in the Data Product Options. The only difference between these two formats is the layout of the data field in sensorData.
Here is the generalized JSON structure for both OM and ONC format JSON:

    {
      "metadata": {
        "dataSection": {
          "dataProductOptionGap": string,
          "dataProductOptionQualityControl": string,
          "dataQualityAssuranceRemark": string,
          "dateFrom": string,
          "dateTo": string,
          "minPercentValidData": string,
          "resampleDescription": string,
          "resamplePeriod": string,
          "resampleType": string,
          "samplePeriodTotal": integer,
          "samplePeriods": [
            { "samplePeriod": integer, "samplePeriodStartTime": string }
          ],
          "totalSample": integer,
          "totalSampleExpected": integer
        },
        "deviceSection": {
          "deploymentTotal": integer,
          "deviceCategory": string,
          "deviceDeployments": [
            {
              "deploymentDate": string,
              "deviceCode": string,
              "deviceId": integer,
              "deviceName": string
            }
          ]
        },
        "locationSection": {
          "depth": double,
          "latitude": double,
          "longitude": double,
          "stationCode": string,
          "stationName": string
        },
        "originSection": {
          "citations": string,
          "creationDate": string,
          "http": string,
          "metadataFileName": string,
          "searchId": integer,
          "source": string
        }
      },
      "sensorData": [
        {
          "actualSamples": integer,
          "sensor": string,
          "sensorName": string,
          "unitOfMeasure": string,
          "data": [ // for ONC-JSON (standard object-based)
            { "sampleTime": string, "value": double, "qaqcFlag": integer },
            ...
          ]
          "data": [ // for OM-JSON (Observations & Measurements)
            { "sampleTime": [array of strings], "value": [array of doubles], "qaqcFlag": [array of integers] }
          ]
        }
      ]
    }

When resampling is selected, additional fields are added, such as 'counts' for the number of samples in the resample (average, min/max) period as well as fields for the min and max values, times at min/max, flags at min/max.
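Since the two JSON formats differ only in how the data field is laid out, one small helper can read both. The sketch below is illustrative only: the function name `samples` and the sample sensor entries are hypothetical, the field names follow the generalized structure above, and it assumes OM-JSON may arrive either as a single object of arrays or as a one-element list holding that object.

```python
import json


def samples(sensor):
    """Return a list of (sampleTime, value, qaqcFlag) tuples for one sensor.

    Handles both layouts described above: a list of per-sample objects
    (ONC-JSON) and parallel arrays (OM-JSON).
    """
    data = sensor["data"]
    # Assumption: OM-JSON may arrive either as one object or as a
    # single-element list holding that object.
    if (
        isinstance(data, list)
        and len(data) == 1
        and isinstance(data[0].get("sampleTime"), list)
    ):
        data = data[0]
    if isinstance(data, dict):
        return list(zip(data["sampleTime"], data["value"], data["qaqcFlag"]))
    return [(d["sampleTime"], d["value"], d["qaqcFlag"]) for d in data]


# Hypothetical sensor entries for illustration only.
onc_sensor = json.loads("""
{"sensorName": "Temperature", "unitOfMeasure": "C", "data": [
  {"sampleTime": "2018-07-01T00:00:00.000Z", "value": 1.5, "qaqcFlag": 1},
  {"sampleTime": "2018-07-01T00:00:01.000Z", "value": 1.6, "qaqcFlag": 1}]}
""")
om_sensor = json.loads("""
{"sensorName": "Temperature", "unitOfMeasure": "C", "data": [
  {"sampleTime": ["2018-07-01T00:00:00.000Z", "2018-07-01T00:00:01.000Z"],
   "value": [1.5, 1.6], "qaqcFlag": [1, 1]}]}
""")
print(samples(onc_sensor) == samples(om_sensor))
```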
Oceans 3.0 API filter: extension=json
Ocean Data View (ODV) spreadsheet files are plain text (ASCII), semicolon-delimited files similar to the CSV files described above. They can be opened by Ocean Data View, R, SPSS, Minitab, MATLAB, Microsoft Excel, text editors or any other software capable of reading delimited text files. Ocean Data View can open/import this format without any additional user input, and the data is instantly available for visualization. MS Excel users can view the file easily by importing it with the delimiter set to a semicolon. Each file is limited to a maximum of 1 million records (for compatibility with MS Excel).
ODV text files have three main parts: the file header (containing metadata), the data header and the data body. The data body contains the same data as a CSV file, with additional metadata columns that Ocean Data View uses to break up multiple deployments (for instance, if instruments are swapped out, Ocean Data View will show this as separate stations). The file header has a fixed number of rows (it grows horizontally with additional data), making it easy to handle in statistical packages like R. The field definitions provided below are the same as those of the MAT format data product (below), as both are generated by the MATLAB scalar data engine.
File Header
The file header contains the same information as the CSV header, with additional fields. All time data listed is in ISO 8601 format with dashes and colons; the time base is UTC or 'zulu' time, which is what the 'Z' following all the time stamps indicates. The header is structured with a comment marker '//' preceding every row, with data contained within XML-like tags that describe the data. For example, the first line of the header will normally be this:

    //<Creator>Ocean Networks Canada - University of Victoria</Creator>

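Header rows of this shape can be collected into a dictionary with a regular expression. This is a minimal sketch: the function name `read_header_tags` is hypothetical, and it assumes each header row holds exactly one tag pair on a single line, as in the Creator example.

```python
import re

# Matches lines such as: //<Creator>Ocean Networks Canada ...</Creator>
TAG_LINE = re.compile(r"^//\s*<(?P<tag>[^<>/\s]+)>(?P<value>.*)</(?P=tag)>\s*$")


def read_header_tags(lines):
    """Collect {tag: value} from '//'-prefixed, XML-like header rows."""
    tags = {}
    for line in lines:
        match = TAG_LINE.match(line.strip())
        if match:
            tags[match.group("tag")] = match.group("value").strip()
    return tags


header_lines = [
    "//<Creator>Ocean Networks Canada - University of Victoria</Creator>",
    "// a comment row without tags is skipped",
]
print(read_header_tags(header_lines))
```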
Additional important fields include:
Data Header
The data header holds the column headings for the data body, as in the following example. The top row shows the column headings stripped of data format tags, the second row describes each heading, and the third row is an example row of data.
Type | Cruise | Station | yyyy-mm-ddThh:mm:ss.sss | Latitude [degrees_north] | Longitude [degrees_east] | time_ISO8601 | Seafloor Pressure [decibar] | QV:ARGO:Seafloor Pressure [decibar] |
---|---|---|---|---|---|---|---|---|
Blank | Device Name (ID number) | Site-Device Name (ID number) | Site-Device start time | (for the Site-Device) | (for the Site-Device) | time of data reading - primary variable | data value - dependent variable | ARGO QAQC Flag |
* | NRCan Bottom Pressure Recorder 5 (22790) | Deep_BPR_2011-07 (100150) | 2013-09-09T23:10:08.222 | 48.814 | -125.281 | 2013-09-09T23:10:08.222 | 107.237646131 | 1 |
Here is an example file for one hour of data: CambridgeBay_UnderwaterNetwork_ConductivityTemperatureDepth_20180701T000000Z_20180701T005958Z-NaN_clean_ODV.txt
Resampled ODV txt files will also contain a variable containing the count of the data points that contributed to the resampled value (average and/or min/max). When averaging, the standard deviation is also provided.
Data Body
The data body contains semicolon-delimited data corresponding to the column headers. Some cells may be blank if there is nothing to report: the Type, Cruise, Station, yyyy-mm-ddThh:mm:ss.sss, Latitude and Longitude columns are blank except when the device deployment changes (as summarized in the file header). Data gaps are filled with 'NaN' (Not a Number). Time stamps follow the ISO 8601 standard with dashes and colons, as shown above. All times are UTC, not local time.
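Outside Ocean Data View, the blank deployment columns usually need to be carried forward so that every row is self-describing. The sketch below shows one way to do that in Python; the function name `read_odv_body` is hypothetical, the second data row in the sample is invented for illustration, and it assumes the six deployment columns are the first six columns, as in the data header example above.

```python
import csv
import io

# Columns left blank until the device deployment changes:
# Type, Cruise, Station, start time, Latitude, Longitude.
CARRY_FORWARD = 6


def read_odv_body(text):
    """Parse the data header and data body of an ODV spreadsheet file."""
    body = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("//")]
    reader = csv.reader(io.StringIO("\n".join(body)), delimiter=";")
    header = next(reader)
    records, last = [], [""] * CARRY_FORWARD
    for row in reader:
        # Reuse the previous deployment metadata when a cell is blank.
        meta = [cell or prev for cell, prev in zip(row[:CARRY_FORWARD], last)]
        last = meta
        records.append(dict(zip(header, meta + row[CARRY_FORWARD:])))
    return header, records


# Shortened sample; the second data row is hypothetical.
odv_text = """//<Creator>Ocean Networks Canada - University of Victoria</Creator>
Type;Cruise;Station;yyyy-mm-ddThh:mm:ss.sss;Latitude [degrees_north];Longitude [degrees_east];time_ISO8601;Seafloor Pressure [decibar];QV:ARGO:Seafloor Pressure [decibar]
*;NRCan Bottom Pressure Recorder 5 (22790);Deep_BPR_2011-07 (100150);2013-09-09T23:10:08.222;48.814;-125.281;2013-09-09T23:10:08.222;107.237646131;1
;;;;;;2013-09-09T23:10:09.222;107.24;1
"""

header, records = read_odv_body(odv_text)
print(records[1]["Station"], records[1]["time_ISO8601"])
```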
Instructions for Getting Started with Ocean Data View - How to Open ODV Spreadsheet Text Files
We recommend that users install the latest version of Ocean Data View from the ODV website. (Version 4 or newer is recommended for ARGO flag support). ODV's getting started guide can be found on their documentation page.
For an even faster start, follow these directions: start Ocean Data View, select File->Open, change the type of file to 'Data Files' so that it will find and accept .TXT files. To add more data to an existing collection (such as the one just created by opening a .TXT), select Import->ODV Spreadsheet. As data is added and plots are made, the data collection is automatically saved in an .ODV file and a sub-folder containing additional data. If you exit Ocean Data View, you can resume your data collection by opening that .ODV file. The default view is a world map, which isn't interesting. To add a simple plot, select View->Layout Templates->SCATTER WINDOW. Following these steps with the above example ODV spreadsheet file, you should see this:
In addition, right-click on the plot to change which variables are plotted and to access the plot's properties. For multiple device deployments, go to View->Station Selection Criteria and select different cruise labels to see the data from different device deployments. To sort out QAQC flags, right-click on the plot, select Sample Selection Criteria and then highlight the flags to filter the data inclusively.
Oceans 3.0 API filter: extension=txt
MAT files (v7) can be opened using MathWorks MATLAB 7.0 or later. These files are limited to a size of ~400 MB (depending on compression within the file; once loaded in MATLAB, the memory footprint will be less than 1 GB). The file contains two structures, metadata and data.
data: a structure array (one structure per sensor).
[Screenshots: the data structure for a single sensor, and the structure array for multiple sensors (device-level), cut off for width.]
The fields in the data structure are defined as:
sensorName: Name of sensor.
sensorCode: Unique string for the sensor.
sensorDescription: Description of sensor.
sensorType: Type of sensor as classified in the ONC data model.
sensorTypeID: ONC ID given to sensor type.
units: Unit of measure for the sensor data.
isEngineeringSensor: boolean (flag) to determine if sensor is an engineering sensor.
sensorDerivation: String describing the source of the sensor data: derived from calibration formula (dmas-derived), calculated on the device (instrument-derived), calculated by an external process (externally-derived), or direct from the instrument.
isMobilePositionSensor: boolean (flag) to determine if sensor is a mobile sensor. Note, this will only be flagged true if this data was added in addition to the requested data. For example, if the user requests a device-level mat product from a GPS device, then the latitude sensor is not flagged. Conversely, if the user requests temperature data from a mobile platform like a ship, then the latitude data from the GPS is added and interpolated to match the time stamps of the temperature sensor. See Positioning and Attitude for Mobile Devices for more information.
deviceID: Unique identifier number for the parent device.
searchDateNumFrom: Start date of the specific search in MATLAB datenum format - searches are truncated by availability and deployment dates.
searchDateNumTo: End date of the specific search in MATLAB datenum format - searches are truncated by availability and deployment dates.
samplePeriod: Vector of sample periods in seconds (specific to this sensor, may be different from other sensors and the device-level sample period).
samplePeriodDateFrom: Vector of the start date of each sample period/size (MATLAB datenum format).
samplePeriodDateTo: Vector of the end date of each sample period/size (MATLAB datenum format).
sampleSize: The size of the data sample (specific to this sensor, may be different from other sensors and the device-level sample size).
resampleType: Type of resampling used.
resampleDescription: Description of the resampling used.
resamplePeriod_sec: Resample period in seconds.
resampleTypeID: Unique identifier of the subsample type used: 0/NaN - none, 1 - average, 2 - decimated (not offered), 3 - min/max, 4 - linear interpolation (VPS pressure only).
dataProductOptions: A string describing the data product options selected for this data product. This information is reflected in the file name.
qaqcFlagDescription: A string describing the flags. See the QAQC page for more information.
time: A vector of data timestamps in MATLAB datenum format.
dat: A vector of sensor values corresponding to each timestamp. When resampling by averaging, this becomes the average value drawn from the resampling time window, also known as "box-car average" as documented here: Time Averaging and Resampling. (May make a separate field for this in the future, especially if users prefer that option).
qaqcFlags: A vector indicating the quality of the data, matching the time and dat vectors. See the QAQC page for more information. Note that this is the final, compiled/combined flag, the result of multiple automated tests and manual flags.
Available when not resampling:
dataDateNumFrom: First time-stamp of the time series.
dataDateNumTo: Last time-stamp of the time series.
samplesExpected: The number of valid samples expected from the minimum returned data to the maximum returned data, accounting for variations in sample period.
samplesReceived: The number of raw samples received, may be less than length(data.time) when data gaps are being filled with the NaN option.
Calibration: Structure (not pictured) containing information on the calibration formula applied to the data, as it appears on the sensor listing page, in the JEP language. Fields include: dateFrom, dateTo, sensorID, name, formula.
QAQC: Structure containing all of the QAQC data for the time range of the search for the sensor. Each test and sensor combination is an instance of this structure. Fields include: qaqcID, sensorID, dateFrom, dateTo, qaqcFlag, sourceSensorID, testLevel, priority, description. These field names are the same as those found on the QAQC test definitions page, https://data.oceannetworks.ca/QaqcAutotestsFinder, with the additional fields priority and sourceSensorID: priority is the priority level for flag combination, as defined by testLevel, while sourceSensorID is the sensorID to which the test was originally applied. When sourceSensorID and sensorID differ, the test was inherited via a calibration formula; salinity is a common example, as it inherits tests on conductivity and temperature. The test formula and attribute values are included as well (where applicable; manual flags do not have formulae), primarily for the benefit of internal users. The formulas are somewhat difficult to parse. Contact Support if you have questions. As a primer, x is the value under test, "$" indicates an internal variable that is available within the system (often mobile position sensor values like "$heading"), while the letter "A" followed by a three-digit number, such as "A629", is an attribute contained in the TestAttributes structure, where the three-digit number is the attributeID. To see all the details of a test, such as the test attribute thresholds, use the aforementioned QaqcAutotestsFinder or go directly to the test definition page by using this URL: https://data.oceannetworks.ca/QaqcAutotestDetails?qaqcId=1, replacing the "1" with the QAQC.qaqcID of the test you wish to investigate. For internal users, please note that the qaqcFlag value is updated from its value in the database (pass or fail) to its nominal flag value (0, 1, 2, 3, 4) as defined in Quality Assurance Quality Control.
For each time stamp (data.time), the final QAQC value (data.qaqcFlags) is determined by combining all of the flags that are applicable at that time. Averaging / resampling data further combines flags within the resample period. This process is documented here: Quality Assurance Quality Control. The final data.qaqcFlags value is therefore a somewhat complicated combination of tests and flags (essentially, the worst flag wins unless a manual flag overrides it). The QAQC structure is provided to support users investigating final flag values, allowing one to trace backwards to the contributing tests. When not resampling, see data.qaqcFlagsqaqcID for the qaqcID of the test that contributed the final flag values. Additionally, to see all the tests and test parameters that contributed to the final flag, pass the data.time value to the following blurb of code. It returns a subset of the QAQC structure that contains the tests that were applicable at the target time:
Here's an example of what the QAQC structure looks like in MATLAB (taken from the example MAT file below, salinity sensor shown to demonstrate inheritance):
EXAMPLE FILE: CambridgeBay_UnderwaterNetwork_ConductivityTemperatureDepth_20230101T000000Z_20230101T235959Z-NaN_clean.mat
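Several fields described above (time, searchDateNumFrom, searchDateNumTo, samplePeriodDateFrom, samplePeriodDateTo) use the MATLAB datenum format. When a MAT file is read outside MATLAB (for example with scipy.io.loadmat), these values need converting. The sketch below uses only the Python standard library; the function name `datenum_to_datetime` is hypothetical, and the result is a naive datetime that is assumed here to represent UTC.

```python
from datetime import datetime, timedelta


def datenum_to_datetime(datenum):
    """Convert a MATLAB datenum (fractional days) to a naive datetime."""
    whole_days = int(datenum)
    fraction = datenum - whole_days
    # MATLAB counts days from year 0000, Python ordinals from year 0001,
    # hence the 366-day offset.
    return (
        datetime.fromordinal(whole_days)
        + timedelta(days=fraction)
        - timedelta(days=366)
    )


# 719529 is the MATLAB datenum of 1970-01-01 00:00:00.
print(datenum_to_datetime(719529.5))
```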
Oceans 3.0 API filter: extension=mat
NetCDF is a common, widely used binary format that is compact, efficient and can be self-describing, especially when used with data standards and conventions, as described here: NetCDF. ONC's time series scalar NetCDF format is very similar to the MAT format above. Fun fact: MAT files and NetCDF files share the same file format basis, HDF5 (MAT7 and NetCDF4). The data product is still in the prototype and internal development phase; specifically, the exact metadata content is still under development. Once this is tested and released, the layout and structure will be documented below. Please contact us if you need to access this format.
Oceans 3.0 API filter: extension=nc