
fix(tests): de-flake template-watcher and ProviderModelsSection timing races#248

Merged
plusplusoneplusplus merged 3 commits into main from fix/flaky-tests-template-watcher-provider-models
May 30, 2026


@plusplusoneplusplus
Owner

Summary

Fixes two CI-flaky tests that have been intermittently failing recent main merge runs. Both are test-only timing bugs, not product defects.

1. template-watcher.test.ts: "should debounce multiple rapid events into a single callback"

The test wrote 10 files immediately after watchWorkspace(). On a slow macOS runner, fs.watch (FSEvents) may not have finished attaching by then, so all 10 events are missed and the callback fires 0 times (`expected "spy" to be called 1 times, but got 0 times`). Every other firing test in this file already waits 100–200ms for the watch to register.

Fix: add a 200ms registration settle before the rapid writes.
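
The race can be reproduced without a real file system. The sketch below is an illustrative model only: `createFakeWatcher`, `countCallbacks`, and the 50ms/20ms delays are hypothetical stand-ins, not the project's watchWorkspace() or its debounce interval. It shows why ten rapid events yield zero callbacks without a settle and exactly one with it.

```typescript
// Illustrative model; names and delays are assumptions, not the real watcher.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

function createFakeWatcher(
  attachDelayMs: number,
  debounceMs: number,
  onChange: () => void,
) {
  let attached = false;
  let timer: ReturnType<typeof setTimeout> | undefined;
  // Registration completes asynchronously, as it can with an OS-level watch.
  setTimeout(() => {
    attached = true;
  }, attachDelayMs);
  return {
    // Simulates one file-system event reaching the watcher.
    emit(): void {
      if (!attached) return; // events before registration are silently lost
      if (timer !== undefined) clearTimeout(timer);
      timer = setTimeout(onChange, debounceMs); // trailing-edge debounce
    },
  };
}

async function countCallbacks(settleMs: number): Promise<number> {
  let calls = 0;
  const watcher = createFakeWatcher(50, 20, () => {
    calls += 1;
  });
  await sleep(settleMs); // the registration settle
  for (let i = 0; i < 10; i++) watcher.emit(); // ten rapid "writes"
  await sleep(100); // let the debounce window elapse
  return calls;
}
```

In this model, `countCallbacks(0)` drops every event because the watcher has not attached yet, while `countCallbacks(200)` coalesces the ten events into a single callback.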

2. ProviderModelsSection.test.tsx: "renders model cards in catalog view"

The test waited only for provider-models-section, then synchronously queried provider-model-card. But the hook returns localModels, which is populated by a useEffect running after loading flips false — so the section renders one tick before the cards exist (Unable to find an element by: [data-testid="provider-model-card"]).

Fix: getAllByTestId → await screen.findAllByTestId, waiting for the async-populated cards.
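
The ordering problem can be sketched as a sequence of committed renders. This is a simplified, hypothetical model of the hook's state (the `HookState` shape, and the assumption that the section is shown whenever `loading` is false, are illustrative and not taken from the component's source):

```typescript
// Illustrative model only; not the real hook or component.
interface HookState {
  loading: boolean; // true while the fetch is in flight
  localModels: string[]; // filled in by an effect after loading flips false
}

// Each entry is one committed render, in order.
function renderSequence(fetched: string[]): HookState[] {
  return [
    { loading: true, localModels: [] }, // initial render: section not shown
    { loading: false, localModels: [] }, // section shown, effect not yet applied
    { loading: false, localModels: fetched }, // effect has run: cards exist
  ];
}

// Assumption: the section wrapper renders as soon as loading is false.
const sectionVisible = (s: HookState): boolean => !s.loading;
const cardCount = (s: HookState): number => s.localModels.length;
```

A synchronous query issued as soon as the section appears lands on the second state, where there are no cards yet. `findAllByTestId` retries the underlying `getAllByTestId` query until it succeeds or times out, so it reaches the third state.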

Verification

  • Both files green locally (24/24).
  • Ran the React test 5× consecutively — 12/12 each time.

🤖 Generated with Claude Code

…g races

Two CI-flaky tests, both test-only timing bugs (not product defects):

- template-watcher "should debounce multiple rapid events into a single
  callback" wrote 10 files immediately after watchWorkspace(). On slow
  macOS FSEvents the watch isn't attached yet, so every event is missed
  and the callback fires 0 times. Add a 200ms registration settle before
  the writes, matching the other firing tests in the file.

- ProviderModelsSection "renders model cards in catalog view" queried
  provider-model-card synchronously after only awaiting the section
  wrapper. The hook returns localModels, populated by a useEffect that
  runs after `loading` flips false, so the section can render one tick
  before the cards exist. Use findAllByTestId to wait for them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@plusplusoneplusplus plusplusoneplusplus enabled auto-merge (squash) May 30, 2026 04:31
plusplusoneplusplus and others added 2 commits May 29, 2026 22:53
The test waited a fixed 100ms after `shortcuts.deleteLogicalGroup`, then
invalidated the cache and reloaded — on slow macOS CI runners, file
watchers + persist hadn't settled, so the reload returned stale
configuration and the assertion failed with `true !== false`.

- Bump pre-refresh settle 200ms → 300ms (matches sibling 'should delete
  logical group through command with real tree item').
- Replace the fixed 100ms post-delete sleep with a poll loop (3s cap,
  100ms interval) that reloads + checks until the group is gone. Robust
  to runner speed without inflating happy-path runtime.
- Set this.timeout(10000) to match the sibling delete tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
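
A bounded poll loop of the kind described above might look like the following. It is a minimal sketch: `pollUntil` is a hypothetical helper name, and only the 3s cap and 100ms interval come from the commit message.

```typescript
// Minimal sketch of a bounded poll; helper name is an assumption.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function pollUntil(
  check: () => Promise<boolean>,
  capMs = 3000,
  intervalMs = 100,
): Promise<boolean> {
  const deadline = Date.now() + capMs;
  for (;;) {
    // Return as soon as the condition holds, so fast runners pay no extra wait.
    if (await check()) return true;
    if (Date.now() >= deadline) return false;
    await sleep(intervalMs);
  }
}

// Hypothetical usage inside the test (the reload and lookup calls are placeholders):
//   const gone = await pollUntil(async () => {
//     /* invalidate the cache, reload the configuration */
//     return /* true once the logical group is no longer present */ false;
//   });
//   assert.strictEqual(gone, true);
```

Because the loop exits on the first successful check, the cap only bounds the failure case.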
The "scroll-to-bottom button appears when scrolled up" test scrolled the
conversation to the top immediately after waitForConversation(). On CI the
initial-load auto-scroll (a deferred requestAnimationFrame in ChatDetail that
sets scrollTop = scrollHeight) had not yet fired, so it ran *after* the test
scrolled up — snapping back to the bottom and clearing isScrolledUp before the
assertion, leaving the button hidden (toHaveClass(/visible/) timed out).

Wait for the conversation to settle at the bottom with sufficient overflow
(> 150px) before scrolling up, so the pending rAF has already run and scrolling
to the top reliably yields dist > 100 and toggles the button visible.

Verified: target test 5/5 green; full Scroll suite 51/51 green over 3 repeats.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
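
The two conditions in this commit message can be written as pure predicates over scroll metrics. This is a sketch under assumptions: the function names and the 1px tolerance are hypothetical, while the 150px overflow and 100px distance thresholds are the ones quoted above.

```typescript
// Illustrative predicates; names and the 1px tolerance are assumptions.
interface ScrollMetrics {
  scrollTop: number;
  scrollHeight: number;
  clientHeight: number;
}

// Pixels of content hidden below the visible area.
const distanceFromBottom = (m: ScrollMetrics): number =>
  m.scrollHeight - m.scrollTop - m.clientHeight;

// Total scrollable range of the container.
const overflow = (m: ScrollMetrics): number => m.scrollHeight - m.clientHeight;

// True once the initial auto-scroll has landed and there is room to scroll up.
function isSettledAtBottom(
  m: ScrollMetrics,
  minOverflowPx = 150,
  tolerancePx = 1,
): boolean {
  return overflow(m) > minOverflowPx && distanceFromBottom(m) <= tolerancePx;
}

// Mirrors the "dist > 100" condition that makes the button visible.
const isScrolledUp = (m: ScrollMetrics, thresholdPx = 100): boolean =>
  distanceFromBottom(m) > thresholdPx;
```

With only 80px of overflow, scrolling to the top gives a distance of 80px, which never crosses the 100px threshold. Requiring more than 150px of overflow before scrolling up rules that case out.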
@plusplusoneplusplus plusplusoneplusplus merged commit a1e0102 into main May 30, 2026
66 of 68 checks passed
@plusplusoneplusplus plusplusoneplusplus deleted the fix/flaky-tests-template-watcher-provider-models branch May 30, 2026 07:20