CDA-74 Created ADR for timeseries csv formatting #1634
Open: rma-bryson wants to merge 23 commits into develop from feature/CDA-74-ADR-for-TimeSeries-CSV (+124 −0)

23 commits
b8bf554 CDA-74 Created ADR for timeseries csv formatting (rma-bryson)
a20af56 CDA-74 Updated ADR for timeseries csv to include doc number and added… (rma-bryson)
d7d25a7 Update docs/source/decisions/0008-timeseries-csv-format.rst (rma-bryson)
7151351 Updated numbering and sectioning of csv ADR (rma-bryson)
738f8f6 Updated csv to be in code block (rma-bryson)
c7687e8 Removed option to include units in headers. While possible to achieve… (rma-bryson)
44b3324 Updated to not include units comment if optionals turned off (rma-bryson)
e6c3ba7 Updated key points to reflect examples better. (rma-bryson)
e7315f1 CDA-74 - ADR revisions based on feedback (rma-bryson)
d37491b Update docs/source/decisions/0008-timeseries-csv-format.rst (rma-bryson)
6173210 CDA-74 - Updated decisions to use a table with justification (rma-bryson)
75c5fc1 CDA-74 - Updates to use list-table (rma-bryson)
4b7650e CDA-74 - Adds note about why headers are always included and clients … (rma-bryson)
e08ab4a CDA-74 - Splits up table to actual decision made (rma-bryson)
3aa46da Update docs/source/decisions/0008-timeseries-csv-format.rst (rma-bryson)
81c3168 Update docs/source/decisions/0008-timeseries-csv-format.rst (rma-bryson)
4bf7a61 Update docs/source/decisions/0008-timeseries-csv-format.rst (rma-bryson)
7df49c1 CDA-74 - removed metadata-count line decision, added version-date to … (rma-bryson)
450746e CDA-74 - Removed serialization api decision as it doesn't need to be … (rma-bryson)
c566fe4 Update docs/source/decisions/0008-timeseries-csv-format.rst (rma-bryson)
00f933e Update docs/source/decisions/0008-timeseries-csv-format.rst (rma-bryson)
36035ea Update docs/source/decisions/0008-timeseries-csv-format.rst (rma-bryson)
f139a86 CDA-112 - Includes decision to not include metadata-as-columns (rma-bryson)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,123 @@ | ||
| ######################### | ||
| CSV Format for TimeSeries | ||
| ######################### | ||
|
|
||
|
|
||
| Summary | ||
| ======= | ||
|
|
||
| This ADR defines a standardized CSV representation for TimeSeries. It specifies a row-per-record CSV format that preserves essential metadata and ensures consistent ingestion by analytics, automation, and warehousing systems. | ||
|
|
||
|
|
||
| Opinions | ||
| ======== | ||
|
|
||
| Opinion 1 | ||
| --------- | ||
|
|
||
| @brysonspilman | ||
|
|
||
| Summary | ||
| ~~~~~~~ | ||
| Because the CSV format is intended for retrieval only, a customized format that follows standard CSV practices is appropriate. | ||
|
|
||
| Key points | ||
| ~~~~~~~~~~ | ||
|
|
||
| .. list-table:: | ||
| :header-rows: 1 | ||
| :widths: 20 25 55 | ||
|
|
||
| * - Topic | ||
| - Decision | ||
| - Justification | ||
| * - Required columns | ||
| - Always include ``date-time`` and ``value``; include units in the value column header as parentheses (e.g., ``value (ft)``) | ||
| - Units should exist in exactly one canonical location in all modes. Emitting them as metadata comments only in some modes would make their location inconsistent and confuse clients. | ||
| * - Optional columns | ||
| - Optional (off by default): ``quality-code``, ``data-entry-date`` | ||
| - Because headers are always included, optional columns can be toggled without breaking parsing. Clients should rely on column names, not indices. Because units are embedded in the ``value`` header (e.g., ``value (cfs)``), clients cannot match that header by an exact name and must account for the units suffix when locating the value column. | ||
| * - Metadata fields | ||
| - Emitted as top-of-payload comments (``metadata-format=comments``) | ||
| - The following fields can be treated as metadata comments at top-of-payload: ``time-series-id``, ``office-id``, ``version-date``. These are optional (off by default). It is assumed that the only comments in the payload will be metadata comments, and as such, clients can parse out metadata by reading comment lines until the first non-comment line is reached. Metadata will not be provided as columns. | ||
| * - Units location | ||
| - Express units only in the value column header via parentheses (e.g., ``value (cfs)``) | ||
| - Do not include units as a separate column or in metadata comments. This avoids the anti-pattern of dual representation; units live in exactly one canonical location. Custom deserialization may be required to extract units from the header, which is preferable to duplicate representations. | ||
| * - Version-date encoding | ||
| - Use ``base`` for 1111-11-11T11:11, ``aggregate`` for aggregate versions, ISO-8601 timestamp for actual version dates, and omit the field if unversioned | ||
| - Matches CWMS-VUE behavior. A separate CSV column per case was rejected due to lack of use-cases and schema bloat. Note this requires custom serialization handling. | ||
| * - Column headers | ||
| - Always include headers | ||
| - RFC 4180 allows headers; including them keeps the format scalable if optional columns are introduced later and prevents reliance on fixed column indices. The Accept header will carry the parameter ``header=present`` to state explicitly that headers are included, even though they will always be present. This leaves room to emit headerless CSV in the future if that is ever needed. | ||
| * - Comments | ||
| - Treat lines beginning with ``#`` as comments | ||
| - While not part of RFC 4180, this convention is already used by CWMS endpoints (e.g., office and location-group) that return CSV, and is human-readable. | ||
| * - Column naming | ||
| - Kebab-case names | ||
| - Keeps naming consistent with JSON and XML. | ||
| * - Accept header for format and columns | ||
| - Use HTTP Accept header parameters to select date format and optional columns | ||
| - Default CSV serialization uses ISO-8601 strings. Examples: ``text/csv;date-format=ISO8601-Instant`` (default), ``text/csv;date-format=epoch-millis``. Use Accept header parameters to enable optional columns (e.g., ``quality=present``, ``data-entry-date=present``). If these were query params instead, toggling would be easier in a browser, but Accept keeps content negotiation consistent. | ||
| * - Quality representation | ||
| - ``quality`` (aka quality-code) is an optional integer bitmask | ||
| - A bitmask (integer) compactly represents multiple boolean flags with fast native bitwise operations; a ``byte[]`` adds overhead without improving expressiveness for fixed flag sets. | ||
| * - Nulls and missing values | ||
| - Missing values will be represented with an empty value field (null) and will have ``quality-code = 5``. Constants will not be used to represent missing values. | ||
| - Keeps behavior consistent with JSON and XML. | ||
| * - Encoding and delimiters | ||
| - UTF-8, comma delimiter, LF line endings | ||
| - A comma-only delimiter follows RFC 4180. Tab, pipe, and semicolon delimiters will not be supported. | ||
| * - Record structure | ||
| - One row per record | ||
| - A record is a single date-time and value pair; ``quality-code`` and ``data-entry-date`` may be included as optional columns. ``version-date`` is also an attribute of the record but is covered under the optional metadata comments, not as a column. | ||
| * - Single TS per payload | ||
| - Do not mix multiple time-series IDs in one payload | ||
| - Ensures a payload represents exactly one time-series. | ||
|
|
||
| Example CSVs | ||
| ~~~~~~~~~~~~ | ||
|
|
||
| 1. All optionals turned off, and no metadata comments: | ||
|
|
||
| .. code-block:: text | ||
|
|
||
| date-time, value (cfs) | ||
| 2021-06-21T00:00:00Z, 0.0 | ||
| 2021-06-22T00:00:00Z, 1.0 | ||
| 2021-06-23T00:00:00Z, 2.0 | ||
| 2021-06-24T00:00:00Z, 3.0 | ||
|
|
||
| 2. All optionals turned off, with metadata-as-comments turned on: | ||
|
|
||
| .. code-block:: text | ||
|
|
||
| # time-series-id: ALAT2.Flow-Out.Inst.1Hour.0.Rev-SWF-REGI | ||
| # office-id: SWT | ||
| # version-date: aggregate | ||
| date-time, value (cfs) | ||
| 2021-06-21T00:00:00Z, 0.0 | ||
| 2021-06-22T00:00:00Z, 1.0 | ||
| 2021-06-23T00:00:00Z, 2.0 | ||
| 2021-06-24T00:00:00Z, 3.0 | ||
|
|
||
| 3. All optionals turned on (quality and data-entry-date), with metadata-as-comments turned off: | ||
|
|
||
| .. code-block:: text | ||
|
|
||
| date-time, value (cfs), data-entry-date, quality-code | ||
| 2021-06-21T00:00:00Z, 0.0, 2021-06-21T00:05:00Z, 5 | ||
| 2021-06-22T00:00:00Z, 1.0, 2021-06-22T00:05:00Z, 5 | ||
| 2021-06-23T00:00:00Z, 2.0, 2021-06-23T00:05:00Z, 5 | ||
| 2021-06-24T00:00:00Z, 3.0, 2021-06-24T00:05:00Z, 5 | ||
|
|
||
| Decision Status | ||
| =============== | ||
|
|
||
| (Status: proposed) | ||
|
|
||
|
|
||
| References | ||
| ========== | ||
|
|
||
| Related Types: cwms.cda.data.dto.TimeSeries, TimeSeries.Record | ||
| Issue/Discussion: https://github.com/USACE/cwms-data-api/issues/1525 | ||
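
The key points in the ADR imply a specific client-side parsing order: read leading ``#`` lines as metadata until the first non-comment line, then read the header, pull the units out of the ``value (...)`` column name, and look up the remaining columns by name. A minimal Python sketch of that flow follows. The function name ``parse_timeseries_csv`` is hypothetical and is not part of cwms-data-api. The sketch assumes the payload looks like the example CSVs in the ADR (``key: value`` comment lines, a space after each comma) and keeps every field as a string.

```python
import csv
import re


def parse_timeseries_csv(payload: str):
    """Hypothetical client-side parser for the CSV layout proposed in the ADR.

    Returns (metadata, units, rows); every field value stays a string.
    """
    lines = payload.splitlines()

    # Metadata comments are assumed to appear only at the top of the payload,
    # so stop at the first line that does not start with '#'.
    metadata = {}
    index = 0
    while index < len(lines) and lines[index].startswith("#"):
        key, _, value = lines[index][1:].partition(":")
        metadata[key.strip()] = value.strip()
        index += 1

    # skipinitialspace drops the space that follows each comma in the examples.
    reader = csv.reader(lines[index:], skipinitialspace=True)
    header = next(reader)

    # Units live only in the value column header, e.g. "value (cfs)".
    units = None
    names = []
    for column in header:
        match = re.fullmatch(r"value \((.+)\)", column.strip())
        if match:
            units = match.group(1)
            names.append("value")
        else:
            names.append(column.strip())

    # Key each record by column name so optional columns can be toggled freely.
    rows = [dict(zip(names, record)) for record in reader if record]
    return metadata, units, rows


example = (
    "# time-series-id: ALAT2.Flow-Out.Inst.1Hour.0.Rev-SWF-REGI\n"
    "# office-id: SWT\n"
    "# version-date: aggregate\n"
    "date-time, value (cfs)\n"
    "2021-06-21T00:00:00Z, 0.0\n"
    "2021-06-22T00:00:00Z, 1.0\n"
)
metadata, units, rows = parse_timeseries_csv(example)
print(metadata["office-id"], units, rows[0]["value"])
```

Keying rows by name rather than position means the same code handles payloads with or without ``quality-code`` and ``data-entry-date``. Splitting comment lines on the first colon only keeps an ISO-8601 ``version-date`` intact, since the timestamp itself contains colons.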