fix(tests): de-flake template-watcher and ProviderModelsSection timing races #248
Merged
plusplusoneplusplus merged 3 commits on May 30, 2026
…g races

Two CI-flaky tests, both test-only timing bugs (not product defects):

- template-watcher "should debounce multiple rapid events into a single callback" wrote 10 files immediately after watchWorkspace(). On slow macOS FSEvents the watch isn't attached yet, so every event is missed and the callback fires 0 times. Add a 200ms registration settle before the writes, matching the other firing tests in the file.
- ProviderModelsSection "renders model cards in catalog view" queried provider-model-card synchronously after only awaiting the section wrapper. The hook returns localModels, populated by a useEffect that runs after `loading` flips false, so the section can render one tick before the cards exist. Use findAllByTestId to wait for them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
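The watcher race described above can be reproduced in miniature. The sketch below is a toy model, not the project's watcher: `SlowWatcher`, `debounce`, and `countCallbacks` are hypothetical names, and the 100ms attach delay and 50ms debounce window are assumed values chosen only to make the effect visible. It shows why writes issued before registration completes yield 0 callbacks, while a 200ms settle yields exactly 1.

```typescript
// Resolves after roughly `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Trailing-edge debounce: a burst of calls collapses into one call to `fn`.
function debounce(fn: () => void, waitMs: number): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return () => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(fn, waitMs);
  };
}

// Hypothetical stand-in for a file watcher whose OS-level registration
// finishes asynchronously. Events emitted before that point are dropped.
class SlowWatcher {
  private attached = false;
  private readonly onChange: () => void;

  constructor(onChange: () => void, attachDelayMs: number) {
    this.onChange = onChange;
    setTimeout(() => {
      this.attached = true;
    }, attachDelayMs);
  }

  // Simulates one file write reaching (or missing) the watcher.
  emit(): void {
    if (this.attached) this.onChange();
  }
}

// Mirrors the shape of the test: start watching, optionally settle,
// fire 10 rapid events, then count how often the debounced callback ran.
async function countCallbacks(settleMs: number): Promise<number> {
  let calls = 0;
  const onChange = debounce(() => {
    calls += 1;
  }, 50);
  const watcher = new SlowWatcher(onChange, 100);

  await sleep(settleMs); // the "registration settle"
  for (let i = 0; i < 10; i++) watcher.emit();
  await sleep(150); // let the debounce window elapse

  return calls;
}
```

With `settleMs` of 0 every event is dropped before the debounce ever sees it; with 200 all ten events land inside one debounce window and collapse into a single callback.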
The test waited a fixed 100ms after `shortcuts.deleteLogicalGroup`, then invalidated the cache and reloaded. On slow macOS CI runners, the file watchers and persistence hadn't settled, so the reload returned stale configuration and the assertion failed with `true !== false`.

- Bump pre-refresh settle 200ms → 300ms (matches sibling 'should delete logical group through command with real tree item').
- Replace the fixed 100ms post-delete sleep with a poll loop (3s cap, 100ms interval) that reloads and checks until the group is gone. Robust to runner speed without inflating happy-path runtime.
- Set this.timeout(10000) to match the sibling delete tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
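A poll loop of the kind this commit describes might look like the sketch below. `pollUntil`, `PollOptions`, and `demo` are hypothetical names introduced for illustration; the actual test may inline the loop or structure it differently. Only the 3s cap and 100ms interval come from the commit message.

```typescript
// Resolves after roughly `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

interface PollOptions {
  timeoutMs: number; // hard cap on total waiting time
  intervalMs: number; // pause between attempts
}

// Hypothetical helper: re-runs `check` until it returns true or the cap
// is reached. Returns whether the condition was met, so the caller can
// assert on the result instead of on a fixed sleep.
async function pollUntil(
  check: () => boolean | Promise<boolean>,
  { timeoutMs, intervalMs }: PollOptions,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await check()) return true; // fast path: returns on the first success
    if (Date.now() >= deadline) return false; // cap reached, give up
    await sleep(intervalMs);
  }
}

// Usage sketch: the condition flips after about 250ms, standing in for a
// slow persist. In the real test, `check` would reload the configuration
// and report whether the deleted group is gone.
async function demo(): Promise<boolean> {
  const goneAt = Date.now() + 250;
  return pollUntil(() => Date.now() >= goneAt, {
    timeoutMs: 3000,
    intervalMs: 100,
  });
}
```

Because the loop returns as soon as the check passes, a fast runner pays for at most one interval, while a slow runner gets up to the full cap.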
The "scroll-to-bottom button appears when scrolled up" test scrolled the conversation to the top immediately after waitForConversation(). On CI the initial-load auto-scroll (a deferred requestAnimationFrame in ChatDetail that sets scrollTop = scrollHeight) had not yet fired, so it ran *after* the test scrolled up, snapping back to the bottom and clearing isScrolledUp before the assertion. That left the button hidden (toHaveClass(/visible/) timed out).

Wait for the conversation to settle at the bottom with sufficient overflow (> 150px) before scrolling up, so the pending rAF has already run and scrolling to the top reliably yields dist > 100 and toggles the button visible.

Verified: target test 5/5 green; full Scroll suite 51/51 green over 3 repeats.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
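The arithmetic behind the 150px and 100px thresholds can be checked with a small sketch. It assumes `dist` is computed as `scrollHeight - scrollTop - clientHeight`, which the commit message does not state explicitly. `ScrollMetrics`, `distanceFromBottom`, `isSettledAtBottom`, and the 1px tolerance are likewise hypothetical, introduced only for illustration.

```typescript
interface ScrollMetrics {
  scrollTop: number;
  scrollHeight: number;
  clientHeight: number;
}

// Assumed definition of `dist`: pixels between the viewport's bottom edge
// and the end of the content.
const distanceFromBottom = (m: ScrollMetrics): number =>
  m.scrollHeight - m.scrollTop - m.clientHeight;

// The button is described as becoming visible once dist > 100.
const isScrolledUp = (m: ScrollMetrics): boolean => distanceFromBottom(m) > 100;

// Precondition the fixed test waits for: the view rests at the bottom
// (within an assumed 1px tolerance) and there is more than 150px of
// overflow to scroll through.
function isSettledAtBottom(
  m: ScrollMetrics,
  minOverflowPx = 150,
  tolerancePx = 1,
): boolean {
  const overflow = m.scrollHeight - m.clientHeight;
  return overflow > minOverflowPx && distanceFromBottom(m) <= tolerancePx;
}

// After the initial auto-scroll has run: 400px of overflow, resting at the bottom.
const atBottom: ScrollMetrics = { scrollTop: 400, scrollHeight: 1000, clientHeight: 600 };
// The same conversation after the test scrolls to the top.
const atTop: ScrollMetrics = { ...atBottom, scrollTop: 0 };
// Too little content: even at the top, dist is only 50.
const shortContent: ScrollMetrics = { scrollTop: 0, scrollHeight: 650, clientHeight: 600 };
```

At `scrollTop = 0` the distance equals the overflow, so requiring more than 150px of overflow beforehand guarantees `dist > 100` once the test scrolls to the top. With `shortContent` the button could never appear, however long the test waited.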
Summary
Fixes two CI-flaky tests that have been intermittently failing recent `main` merge runs. Both are test-only timing bugs, not product defects.

1. `template-watcher.test.ts` › should debounce multiple rapid events into a single callback

The test wrote 10 files immediately after `watchWorkspace()`. On a slow macOS runner, `fs.watch` (FSEvents) hasn't finished attaching, so all 10 events are missed and the callback fires 0 times (`expected "spy" to be called 1 times, but got 0 times`). Every other firing test in this file already waits 100–200ms for the watch to register.

Fix: add a 200ms registration settle before the rapid writes.
2. `ProviderModelsSection.test.tsx` › renders model cards in catalog view

The test waited only for `provider-models-section`, then synchronously queried `provider-model-card`. But the hook returns `localModels`, which is populated by a `useEffect` running after `loading` flips false, so the section renders one tick before the cards exist (`Unable to find an element by: [data-testid="provider-model-card"]`).

Fix: `getAllByTestId` → `await screen.findAllByTestId`, waiting for the async-populated cards.

Verification
🤖 Generated with Claude Code