Summary
Two distinct tests failed against committed code (Timer builds on main) in the 24-hour window ending 2026-05-23T10:00 UTC. Both are chronic flaky tests that did not reproduce locally with the original seed, indicating timing-dependent failures.
Failing Tests
1. MixedClusterClientYamlTestSuiteIT.test {p0=cluster.health/10_basic/cluster health with closed index}
| Field | Value |
| --- | --- |
| Build | 77997 |
| Seed | DE77915B3B9483AC |
| Module | qa/mixed-cluster (BWC test against v3.6.1) |
| Error | expected [2xx] but got [408 Request Timeout]; cluster was red with 51 unassigned shards |
| Reproduced locally | No (passed with seed) |
| First seen | 2024-03-25 |
| Total builds affected | 135 |
| Pattern | Worsening: quiet Aug 2025–Mar 2026, resurfaced Apr 2026 (9 builds) and May 2026 (11 builds), coinciding with m7a.8xlarge runner migration |
2. SearchRestCancellationIT.testAutomaticCancellationDuringFetchPhase
| Field | Value |
| --- | --- |
| Build | 77989 |
| Seed | B6D9D10CA83177C3 |
| Module | qa/smoke-test-http |
| Error | AssertionError in ensureSearchTaskIsCancelled; assertBusy timed out waiting for task cancellation |
| Reproduced locally | No (passed with seed) |
| First seen | 2024-04-04 |
| Total builds affected | 205 |
| Pattern | Significantly worsening: steady low-level flakiness since Apr 2024, major spike Nov 2025 (41 builds), now at its worst level in May 2026 (33 builds in 23 days). The recent increase is consistent with the move to faster CI runners. |
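
The monthly counts for SearchRestCancellationIT cover periods of different length, so a per-day rate makes the comparison clearer. The sketch below is a rough check only. It assumes the May figure covers 1 to 23 May 2026 (the report window ends on 2026-05-23) and that the 41 builds in November 2025 span the full 30-day month; neither assumption is stated explicitly in the data above.

```python
def builds_per_day(builds: int, days: int) -> float:
    """Average number of affected builds per calendar day."""
    return builds / days


# Assumption: the 33 May builds fall within 1 to 23 May 2026.
may_2026 = builds_per_day(33, 23)
# Assumption: the 41 November builds span all 30 days of November 2025.
nov_2025 = builds_per_day(41, 30)
# Naive linear projection of the May rate over a full 31-day month.
projected_may = may_2026 * 31

print(f"May 2026: {may_2026:.2f}/day, Nov 2025: {nov_2025:.2f}/day, "
      f"projected May total: {projected_may:.0f}")
```

Under these assumptions the May rate (about 1.43 builds per day) is slightly above the November rate (about 1.37 per day), and May would end with roughly 44 affected builds if the rate holds.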
Summary Table
| Test | Builds Affected | First Seen | Trend | Reproduced |
| --- | --- | --- | --- | --- |
| SearchRestCancellationIT.testAutomaticCancellationDuringFetchPhase | 205 | 2024-04-04 | Significantly worsening | No |
| MixedClusterClientYamlTestSuiteIT.test {p0=cluster.health/10_basic/cluster health with closed index} | 135 | 2024-03-25 | Worsening (resurfaced) | No |
Notes
- Neither test reproduced with the original seed, which is expected for timing-dependent failures. The seeds control randomization of test parameters but not thread scheduling, network timing, or GC pauses.
- Both tests show increased failure rates starting April 2026, consistent with the CI runner migration from m5.8xlarge to m7a.8xlarge (faster CPUs amplify race windows).
- SearchRestCancellationIT is the higher-priority target: 205 builds affected and actively worsening. The failure is an assertBusy timeout waiting for search task cancellation, which suggests the cancellation propagation path has a timing sensitivity that faster hardware exposes more often.
- The MixedClusterClientYamlTestSuiteIT failure is a cluster health timeout in a BWC mixed-cluster scenario, likely related to shard allocation timing in a heterogeneous cluster.
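
To illustrate why re-running with the original seed does not reproduce this kind of failure, here is a minimal Python analogy. It is not the test framework's actual assertBusy implementation. `assert_busy`, `run_cancellation`, `cancel_delay_s`, and `shard_count` are hypothetical names introduced for illustration: the seed fixes the randomized parameter, while the outcome depends only on how long the asynchronous cancellation takes relative to the polling deadline.

```python
import random
import threading
import time


def assert_busy(check, timeout_s, poll_s=0.01):
    """Retry `check` until it stops raising AssertionError or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            check()
            return
        except AssertionError:
            if time.monotonic() >= deadline:
                raise  # deadline reached: surface the last assertion failure
            time.sleep(poll_s)


def run_cancellation(seed, cancel_delay_s, timeout_s):
    """Hypothetical stand-in for a test that waits for an asynchronous cancellation."""
    rng = random.Random(seed)
    shard_count = rng.randint(1, 10)  # seed-controlled test parameter
    cancelled = threading.Event()
    # The delay models scheduling and network latency, which the seed does not control.
    timer = threading.Timer(cancel_delay_s, cancelled.set)
    timer.start()

    def check():
        assert cancelled.is_set(), "task not cancelled yet"

    try:
        assert_busy(check, timeout_s)
        return shard_count, True
    except AssertionError:
        return shard_count, False
    finally:
        timer.cancel()


seed = int("B6D9D10CA83177C3", 16)
fast = run_cancellation(seed, cancel_delay_s=0.01, timeout_s=1.0)
slow = run_cancellation(seed, cancel_delay_s=0.5, timeout_s=0.05)
print(fast, slow)
```

Both runs draw the same `shard_count` from the same seed, yet the first passes and the second times out. The only difference between them is the delay before the cancellation lands.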