
Flaky test report: committed-code failures on 2026-05-21 #272

@andrross


Summary

Six distinct tests failed in committed-code gradle-check builds (Timer/main and Post Merge Action) in the 24 hours ending 2026-05-21T10:00 UTC.

Summary Table (sorted by total builds affected)

| Test | Builds Affected (all-time) | First Seen | Reproduced Locally | Trend |
| --- | --- | --- | --- | --- |
| TransferManagerRemoteDirectoryReaderTests.testOverflowDisabledAsynchronous | 52 | 2025-07-28 | No | Stable (chronic, low-rate) |
| ClusterShardLimitIT.testOpenIndexOverLimit | 51 | 2025-10-15 | No | Worsening (10 in May 2026) |
| KeywordTermsAggregatorTests.testStarTreeKeywordTerms | 49 | 2025-01-29 | Yes | Stable (~2-6/month) |
| ClusterDisruptionIT.testAckedIndexing | 39 | 2024-04-05 | No | Stable (chronic, low-rate) |
| NRTReplicationEngineTests.testAcquireLastIndexCommit | 19 | 2025-10-13 | Yes | Worsening (8 in May 2026) |
| IngestPipelineFromKafkaIT.testTransformGeo | 1 | 2026-05-20 | No | New (first occurrence) |

Detailed Findings

1. KeywordTermsAggregatorTests.testStarTreeKeywordTerms

  • Build: 77832 (Timer/main)
  • Error: java.lang.AssertionError: expected:<0> but was:<1>
  • Seed: B36289C7F30751B7:8ADE6733C98A6020
  • Reproduced locally: Yes (deterministic with seed)
  • First seen: 2025-01-29
  • Total builds affected: 49
  • Pattern: Chronic flake, stable at 2-6 failures/month since Jan 2025. The failure reproduces deterministically with the seed, which points to a test-logic bug or an over-strict invariant rather than a timing issue.
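
For anyone retrying the seeded reproduction, here is a minimal sketch that assembles the Gradle invocation from a seed in this report. The project path `:server` and the package `org.example` are placeholders and not taken from this report; substitute the real module and fully qualified class name. `--tests` and `-Dtests.seed` are the standard Gradle test filter and randomized-testing seed property.

```python
import shlex


def reproduce_command(project, test_class, method, seed):
    """Build a shell-safe Gradle command that reruns one test with a fixed seed.

    `project` and the package prefix of `test_class` are supplied by the
    caller; the values used below are placeholders, not the real locations.
    """
    args = [
        "./gradlew",
        f"{project}:test",
        "--tests",
        f"{test_class}.{method}",
        f"-Dtests.seed={seed}",
    ]
    # Quote each argument so the result can be pasted into a shell as-is.
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    print(
        reproduce_command(
            ":server",  # placeholder project path
            "org.example.KeywordTermsAggregatorTests",  # placeholder package
            "testStarTreeKeywordTerms",
            "B36289C7F30751B7:8ADE6733C98A6020",
        )
    )
```

Passing the full `master:method` seed pins both the suite-level and the method-level randomness. Passing only the part before the colon fixes the suite-level seed alone.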

2. TransferManagerRemoteDirectoryReaderTests.testOverflowDisabledAsynchronous

  • Build: 77760 (Post Merge Action)
  • Error: java.lang.AssertionError: unexpected exception type thrown; expected:<java.io.IOException> but was:<org.apache.lucene.store.AlreadyClosedException>
  • Seed: BAF560FBF5C91AB1:DDBF807DEFE1C0B5
  • Reproduced locally: No
  • First seen: 2025-07-28
  • Total builds affected: 52
  • Pattern: Chronic flake. Peaked at 24 builds in Jul 2025, then dropped to 2-5/month. Not reproducible with the seed, which suggests a timing/concurrency issue; the AlreadyClosedException is consistent with a race between close and read.

3. NRTReplicationEngineTests.testAcquireLastIndexCommit

  • Build: 77754 (Timer/main)
  • Error: java.lang.AssertionError: expected:<2> but was:<1>
  • Seed: 45E9F70B02C418C3:FF61A09C2DD92CD2
  • Reproduced locally: Yes (deterministic with seed)
  • First seen: 2025-10-13
  • Total builds affected: 19
  • Pattern: Worsening. Was 0-3/month through Apr 2026, then rose to 8 in May 2026. The failure reproduces deterministically with the seed, which points to a test-logic bug. The May spike coincides with the mid-April 2026 CI runner migration to m7a.8xlarge.

4. ClusterDisruptionIT.testAckedIndexing

  • Build: 77731 (Post Merge Action)
  • Error: java.lang.AssertionError: failed to reach a stable cluster of [3] nodes
  • Seed: 4134A6DD9DA8F3B5
  • Reproduced locally: No
  • First seen: 2024-04-05
  • Total builds affected: 39
  • Pattern: Chronic flake, stable at 1-5/month. Classic disruption-test timing sensitivity. Not reproducible with the seed, as expected for cluster disruption tests: the seed controls the disruption choice but not packet timing.

5. ClusterShardLimitIT.testOpenIndexOverLimit

  • Build: 77706 (Post Merge Action)
  • Error: java.lang.IllegalStateException: Some shards are still open after the threadpool terminated. Something is leaking index readers or store references.
  • Seed: E04E07ABFF50644E:B1E1771B71AAE338
  • Reproduced locally: No
  • First seen: 2025-10-15
  • Total builds affected: 51
  • Pattern: Worsening. Was 2-9/month, then reached 10 in May 2026 with the month still in progress. The resource leak during teardown suggests a timing-sensitive cleanup issue. The May spike coincides with the CI runner migration.

6. IngestPipelineFromKafkaIT.testTransformGeo

  • Build: 77731 (Post Merge Action)
  • Error: org.awaitility.core.ConditionTimeoutException: Condition was not fulfilled within 1 minutes.
  • Seed: 4134A6DD9DA8F3B5:289E29B5864C63A3
  • Reproduced locally: No
  • First seen: 2026-05-20
  • Total builds affected: 1
  • Pattern: First occurrence. Could be a one-off infrastructure issue or a newly introduced flake. Needs monitoring over the next few days to determine if it recurs.

Methodology

  • Failures identified from the OpenSearch metrics cluster (gradle-check-* indices) filtered to Timer/main and Post Merge Action builds in the past 24 hours.
  • Historical patterns aggregated across all build types (including PR builds) using monthly date histograms with unique build cardinality.
  • Seeds extracted from Jenkins test report API (errorDetails/errorStackTrace fields).
  • Local reproduction attempted using the exact seed from the failing build on the current main branch.
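
The monthly aggregation described above can be sketched as an OpenSearch query body: a `date_histogram` bucketed by calendar month with a `cardinality` sub-aggregation counting unique builds. The field names (`test_class`, `test_name`, `test_status`, `build_start_time`, `build_number`) are assumptions for illustration; the actual mapping of the gradle-check-* indices may differ.

```python
def monthly_failure_query(test_class, test_name):
    """Query body: unique failing builds per calendar month for one test.

    All field names here are assumed, not confirmed against the real index.
    """
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"test_class": test_class}},
                    {"term": {"test_name": test_name}},
                    {"term": {"test_status": "FAILED"}},
                ]
            }
        },
        "aggs": {
            "per_month": {
                "date_histogram": {
                    "field": "build_start_time",
                    "calendar_interval": "month",
                },
                "aggs": {
                    "unique_builds": {"cardinality": {"field": "build_number"}}
                },
            }
        },
    }


def builds_per_month(response):
    """Flatten a search response into (month, unique build count) pairs."""
    buckets = response["aggregations"]["per_month"]["buckets"]
    return [(b["key_as_string"], b["unique_builds"]["value"]) for b in buckets]
```

The body would be sent to the `_search` endpoint of the gradle-check-* index pattern. Note that `cardinality` is an approximate count, which is usually exact at the small per-month build counts reported here.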
