Skip to content

Perf: 1Million Challenge for CDA TS GET#1692

Open
krowvin wants to merge 18 commits intodevelopfrom
feature/timeseries-read-perf
Open

Perf: 1Million Challenge for CDA TS GET#1692
krowvin wants to merge 18 commits intodevelopfrom
feature/timeseries-read-perf

Conversation

@krowvin
Copy link
Copy Markdown
Collaborator

@krowvin krowvin commented Apr 14, 2026

Work on #1685

Improve unfiltered /timeseries read performance + perf coverage

  • move the main unfiltered /timeseries read path off the Oracle retrieve_ts_out_tab hot path and onto direct AV_TSV_DQU/jOOQ reads while preserving retrieve_ts response semantics in Java
  • keep the direct path aligned with Oracle behavior using integration parity coverage, including a DST-window regression case for regular series
  • Java/Gradle benchmark harness
  • add explicit fixture teardown for the benchmark runner so local perf runs exit cleanly and leave the workspace in a predictable state

Benchmark

Local single-request benchmarks against the Docker/Testcontainers Oracle dev environment, using page-size equal to the requested row count:

  • legacy path, 250000 points: 40.679s
  • new direct path, 250000 points: 4.652501s
  • legacy path, 500000 points: HTTP 500

The current Java benchmark harness also confirms the direct path now returns the expected count for the large seeded series:

  • direct path, 250000 points: 250000 reported rows, correct final timestamp, 4.652501s

The benchmark task writes JSON artifacts under load_data/performance/results/.

Validation

  • ./gradlew :cwms-data-api:integrationTests --tests cwms.cda.api.TimeSeriesDirectReadParityIT
  • ./gradlew :cwms-data-api:timeseriesReadBenchmark -Pbenchmark.pointCount=250000 -Pbenchmark.runs=1 -Pbenchmark.forceReseed=true

Notes

  • scope here is the main unfiltered /timeseries read path; /timeseries/filtered is still on the legacy path
  • could move to larger streaming/lower-materialization for more improvements

Seeding 250,000 TS values for SPK and reading them back via CDA returns in 4.6s

Before changes

Current 500k+ Fails
Current 250k in 40.679s

Changes

250k in 4.793s
500k in 7.498s
1,000,000 in about 14.9s

Tool Usage

  • AI Tools used

@krowvin krowvin requested a review from MikeNeilson April 14, 2026 12:01
@krowvin
Copy link
Copy Markdown
Collaborator Author

krowvin commented Apr 14, 2026

Test failures appear to be related to the updated CWMS DB schema / AV_LOC_ALIAS/LOCATION_CODE changes not the src specific to this PR.

Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Copy link
Copy Markdown
Contributor

@MikeNeilson MikeNeilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually still looks reasonable. I don't have much more to add over what Adam has already said. Few suggestions on readability.

Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
Comment thread cwms-data-api/src/test/java/helpers/TimeSeriesReadBenchmark.java Outdated
Comment thread cwms-data-api/build.gradle Outdated
Copy link
Copy Markdown
Collaborator Author

@krowvin krowvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a few code change notes to be aware of on this round

Comment thread cwms-data-api/src/main/java/cwms/cda/data/dao/TimeSeriesDaoImpl.java Outdated
@krowvin krowvin requested review from MikeNeilson and removed request for MikeNeilson April 30, 2026 21:34
Copy link
Copy Markdown
Contributor

@MikeNeilson MikeNeilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable @adamkorynta should determine if is comments were appropriately resolve.

I think so, but he may have been thinking of something else.

int pageSize = queryParamAsClass(ctx, new String[]{PAGE_SIZE },
Integer.class, DEFAULT_PAGE_SIZE, metrics,
name(TimeSeriesController.class.getName(), GET_ALL));
pageSize = Controllers.validateTimeSeriesPageSize(pageSize);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you pass queryParamAsClass(....) into validateTimeSeriesPageSize you can have pageSize be final here and avoid the extra assignment later.

Copy link
Copy Markdown
Collaborator Author

@krowvin krowvin May 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still seeing the original code. Not a huge deal for this one though, more a style choice than anything.

@krowvin
Copy link
Copy Markdown
Collaborator Author

krowvin commented May 1, 2026

I asked @inguyen314 for a list of TSIDs I could potentially use for testing against the 1M and current version of the api for integration testing

MVS TSIDs (Click to expand)
Carlyle Lk-Kaskaskia.Flow-Out.Inst.~1Day.0.lakerep-rev-forecast
Lk Shelbyville-Kaskaskia.Flow-Out.Inst.~1Day.0.lakerep-rev-forecast
Mark Twain Lk-Salt.Flow-In.Inst.~1Day.0.lakerep-rev-forecast
Mark Twain Lk-Salt.Flow-Out.Inst.~1Day.0.lakerep-rev-forecast
Mark Twain Lk-Salt.Flow.Inst.~1Day.0.lakerep-rev-forecast
Rend Lk-Big Muddy.Flow-Out.Inst.~1Day.0.lakerep-rev-forecast
Wappapello Lk-St Francis.Flow-Out.Inst.~1Day.0.lakerep-rev-forecast

Birds Point-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Cairo-Ohio.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Cape Girardeau-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Chester-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Commerce-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Engineers Depot-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Florence-Illinois.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Grafton-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Grand Tower-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Grays Pt-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Hardin-Illinois.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Herculaneum-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Jefferson Brks-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
La Grange TW-Illinois.Stage.Inst.~1Day.0.netmiss-fcst-versioned
LD 22 TW-Mississippi.Flow.Inst.~1Day.0.netmiss-fcst-versioned
LD 22 TW-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
LD 24 Pool-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
LD 24 TW-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
LD 25 Pool-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
LD 25 TW-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
LD 27 Pool-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
LD 27 TW-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Louisiana-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Mel Price Pool-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Mel Price TW-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Meredosia-Illinois.Flow.Inst.~1Day.0.netmiss-fcst-versioned
Meredosia-Illinois.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Moccasin Springs-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Mosier Ldg-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Nav TW-Kaskaskia.Flow.Inst.~1Day.0.netmiss-fcst-versioned
Nav TW-Kaskaskia.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Price Ldg-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Red Rock Ldg-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
St Louis-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Thebes-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Thompson Ldg-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-versioned
Valley City-Illinois.Stage.Inst.~1Day.0.netmiss-fcst-versioned


Birds Point-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Cairo-Ohio.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Cape Girardeau-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Chester-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Commerce-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Engineers Depot-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Grand Tower-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Grays Pt-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Herculaneum-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Hermann-Missouri.Flow.Inst.~1Day.0.netmiss-fcst-28-versioned
Jefferson Brks-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
La Grange TW-Illinois.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
LD 22 TW-Mississippi.Flow.Inst.~1Day.0.netmiss-fcst-28-versioned
LD 22 TW-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
LD 24 TW-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
LD 25 TW-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
LD 27 Pool-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
LD 27 TW-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Mel Price TW-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Meredosia-Illinois.Flow.Inst.~1Day.0.netmiss-fcst-28-versioned
Meredosia-Illinois.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Moccasin Springs-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Nav TW-Kaskaskia.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Price Ldg-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Red Rock Ldg-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
St Louis-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Thebes-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Thompson Ldg-Mississippi.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned
Valley City-Illinois.Stage.Inst.~1Day.0.netmiss-fcst-28-versioned

Copy link
Copy Markdown
Contributor

@MikeNeilson MikeNeilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm overall happy with it, some of the test code that loads the large data isn't using bind variables or otherwise efficient operations. Bind vars are a must, but efficiency could go either way but it likely affecting performance results given how abusive it would be the the query planner cache in the database.

int pageSize = queryParamAsClass(ctx, new String[]{PAGE_SIZE },
Integer.class, DEFAULT_PAGE_SIZE, metrics,
name(TimeSeriesController.class.getName(), GET_ALL));
pageSize = Controllers.validateTimeSeriesPageSize(pageSize);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still seeing the original code. Not a huge deal for this one though, more a style choice than anything.

List<SeedRow> sortedRows = new ArrayList<>(rows);
sortedRows.sort(Comparator.comparing(seedRow -> seedRow.dateTime));

for (SeedRow row : sortedRows) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should really be handled as a batch operation. Though if it wasn't exceptionally slow in your initial testing we can patch it up later.

e.g.

try ( stmt = prepare )
{
  for (var row: rows)
  {
    stmt.set...

    stmt.add
  }
  stmt.executeBatch
}

alternatively, you call executeBatch every X rows, where X is found through experimentation.

All let's the query planner be a bit more efficient.


for (SeedRow row : sortedRows) {
int year = OffsetDateTime.ofInstant(row.dateTime, ZoneOffset.UTC).getYear();
String sql = "insert into at_tsv_" + year
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, this should be fixed, even in tests... bind variables, always bind variables.

? toOracleDateExpression(versionDate)
: "null";
String maxVersionFlag = versionDate != null ? "'F'" : "'T'";
String sql = "select round((date_time - date '1970-01-01') * 86400000) as date_time_ms,"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bind variables

@krowvin krowvin force-pushed the feature/timeseries-read-perf branch from 341f74b to 5696cd7 Compare May 6, 2026 14:49
@krowvin krowvin requested a review from MikeNeilson May 6, 2026 21:21
private static final FieldMapping AV_CWMS_TS_ID2_FIELD_MAP = new CwmsTsId2FieldMapping();
private static final FieldMapping AV_CWMS_TS_ID_FIELD_MAP = new CwmsTsIdFieldMapping();

private static final class DirectReadMetadata {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is in our checkstyle, but java conventions generally put inner classes at the bottom of parent classes

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can add that to the checkstyle file, we definitely should.

@krowvin krowvin requested a review from adamkorynta May 8, 2026 04:59
MikeNeilson pushed a commit that referenced this pull request May 8, 2026
Emphasize the importance of using bind variables in SQL queries to
prevent string concatenation / security issues.

## Summary

Mike noticed string concat in one of the test files in #1692
@MikeNeilson
Copy link
Copy Markdown
Contributor

Failing test needs to be fixed. Suspect some of the logic change isn't getting it down the right code path:

TimeseriesControllerTestIT (schema: 25.07.01) > TimeseriesControllerTestIT.test_wrong_units (schema: 25.07.01) FAILED
    java.lang.AssertionError: 1 expectation failed.
    Expected status code is <400> but was <200>.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants