CRE-4150: fix race in Test_DataSource/Observe/records_telemetry#22344
Open
CRE-4150: fix race in Test_DataSource/Observe/records_telemetry#22344
Conversation
The test captures `ch := tm.ch` after `ds.Observe()` returns, but the production observation loop's next iteration calls MakeObservationScopedTelemetryCh again and overwrites tm.ch with a fresh empty channel before the test reads it. With the fast loopWakeCh pacing, the next iteration can begin in microseconds, so the test can end up ranging over an empty/closed channel and panicking on `telems[:3]` with "slice bounds out of range [:3] with capacity 0". Track every channel the mock creates in a `channels []chan any` slice (under the existing mu) and have the test read `channels[0]` instead of the latest tm.ch. The first channel is the one populated by the asserted first iteration (stream IDs 21/22/23), and it is provably non-empty by the time Observe returns since loopStartedCh only closes after the first MakeObservationScopedTelemetryCh call. Verified: 35 consecutive passes with -race -shuffle=on after the change.
Contributor
|
✅ No conflicts with other open PRs targeting |
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.





Summary
Fixes flake in
Test_DataSource/Observe/records_telemetry(Trunk:panic: runtime error: slice bounds out of range [:3] with capacity 0atdata_source_test.go:345).The test captures
ch := tm.chafterds.Observe()returns. Under fastloopWakeChpacing (microseconds), the production observation loop's next iteration callsMakeObservationScopedTelemetryChagain and overwritestm.chwith a fresh empty channel before the test reads it. The test then ranges over an empty/closed channel, getstelemsof length 0, and panics ontelems[:3].Fix
Track every channel the mock creates in a new
channels []chan anyslice (guarded by the existingtm.mu). Readchannels[0]instead of the racytm.ch— the first channel is the one populated by the asserted first iteration (stream IDs 21/22/23), and is provably non-empty by the timeObservereturns becauseloopStartedChonly closes afterMakeObservationScopedTelemetryChhas been called at least once.Test code only — no production-code changes.
Flaky test fixes
Test plan
Flakiness-classification: TEST (user override from AMBIGUOUS — Tier 1 string-match signals did not match the goroutine-shared-state race pattern, but stack trace and code analysis pinpoint a clear test-side race)
🤖 Generated with Claude Code