Skip to content

Actually fix the software factory shard 1/3 instantiate-validation flake (follow-up to #4782)#4802

Open
habdelra wants to merge 3 commits into
mainfrom
fix-instantiate-flake-followup
Open

Actually fix the software factory shard 1/3 instantiate-validation flake (follow-up to #4782)#4802
habdelra wants to merge 3 commits into
mainfrom
fix-instantiate-flake-followup

Conversation

@habdelra
Copy link
Copy Markdown
Contributor

Summary

PR #4782 added diagnostic logging and a soft awaitSpecSearchable
gate, but didn't actually fix the underlying flake. After it merged
I confirmed the root cause and landed two more commits on the
branch that didn't make the merge — this PR brings them up.

The actual bug

The realm-server's indexer drops a Spec from _federated-search
whenever its linkedExamples loadLinks walk can't settle on the
target — both when the target file is missing (404) and when the
target card is in error_doc state. In the failing fixture, the Spec
linked to TagsCard/bad-example.json, the bad shape
(containsMany got a string) put that example in error_doc on
indexing, the Spec got silently disqualified from search, and
InstantiateValidationStep.discoverRealmSpecs came back empty.

Confirmed by stripping linkedExamples from the Spec in an
experiment — the bare Spec surfaces in search within seconds.

What this PR does

  1. Seed instantiate-validation fixtures via sync(waitForIndex:true)
    Stages the seed files in a temp dir and pushes via
    client.sync(..., { preferLocal: true, waitForIndex: true }). The
    _atomic upload appends ?waitForIndex=true, so the realm-server
    blocks on performIndex() before responding. Replaces the
    per-file client.write + polling-for-search approach with a
    deterministic "indexer is settled" boundary. (This was the
    intermediate fix I tried before realizing it didn't help on its
    own — but it's still the right shape and pairs cleanly with the
    real fix below.)

  2. Keep realm-side example well-formed; overwrite workspace copy
    Seeds the realm-side TagsCard/bad-example.json with a
    well-formed empty-tags placeholder so the Spec's loadLinks
    walk completes. New overwriteTagsExampleWithBadShape(workspaceDir)
    helper writes the broken containsMany: "not-an-array" shape
    into the workspace copy after client.pull. The validation
    pipeline reads example JSON from workspaceDir (see
    prepareExampleInstance in instantiate-execution.ts), not from
    the realm index, so the bad shape only needs to live in the
    workspace at the moment the step / in-memory tool reads it.

Both instantiate-validation.spec.ts:91 and
run-instantiate-in-memory.spec.ts:71 get the
overwriteTagsExampleWithBadShape call after client.pull.

Local verification

Both previously flaking tests now pass deterministically across
local runs. The realm-indexer behavior (Spec dropped from search
whenever its linkedExamples target won't settle) is worth filing
separately — this PR just unblocks the e2e on the fixture side.

Test plan

  • pnpm lint clean (per-package)
  • pnpm test:playwright tests/instantiate-validation.spec.ts tests/run-instantiate-in-memory.spec.ts — both target tests
    pass; two unrelated tests in the same in-memory spec file
    (197, 295) still flake locally with realm-server "database
    does not exist" errors during harness teardown (separate infra
    issue, not from this change)

🤖 Generated with Claude Code

habdelra and others added 2 commits May 12, 2026 20:36
The previous approach (per-file `client.write` + a 90s poll waiting
for the Spec to surface in `_federated-search`) was a polling race
against realm-side async indexing. CI run 25768256725 confirmed even
a 90s budget sometimes never sees the Spec, regardless of write
ordering or whether the bad-example file existed when the indexer
processed the Spec.

Switch to the realm-server's first-class read-after-write boundary:

- Stage all three files in a fresh temp dir.
- `client.sync(..., { preferLocal: true, waitForIndex: true })`
  pushes them as a single `_atomic` batch with `?waitForIndex=true`,
  which makes the realm-server's `_atomic` handler `await
  performIndex()` before responding.
- Cleanup the temp dir.

This trades one push round-trip for a deterministic
"indexer is settled" boundary — the call doesn't return until the
realm has processed every seeded file, so the downstream
`InstantiateValidationStep.discoverRealmSpecs` is guaranteed to see
the Spec on its first search hit.

Also adds one diagnostic log on the way out (realm file listing +
current Spec search hits) so if the realm's `performIndex` returns
without the Spec actually being searchable — the failure mode we
were trying to diagnose — we capture it in CI logs alongside the
`InstantiateValidationStep` warn-log added earlier in this PR.

Removed: the `awaitSpecSearchable` poll-and-warn helper. The
underlying mechanism it papered over (realm acks file write before
indexer surfaces it in search) is now handled at the realm side
via the atomic-upload waitForIndex query param.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Diagnosed the actual flake on PR #4782 by stripping linkedExamples
from the Spec — the bare Spec then surfaced in `_federated-search`
within seconds, confirming the realm indexer drops a Spec whenever
its `linkedExamples` `loadLinks` walk can't settle on the target.
That happens both when the target is missing (404 — the variant
tried previously in this PR) and when the target is in error_doc
state (containsMany got a string — the original CI flake).

Either way, seeding the realm directly with a broken-shape example
silently disqualifies the Spec from search. So:

- Seed the realm-side `TagsCard/bad-example.json` with a well-formed
  empty-tags placeholder so the Spec's loadLinks walk completes and
  the Spec surfaces normally.
- New `overwriteTagsExampleWithBadShape(workspaceDir)` helper writes
  the broken `containsMany: "not-an-array"` shape into the workspace
  copy. The validation pipeline reads example JSON from the workspace
  path (see `prepareExampleInstance` in instantiate-execution.ts),
  not from the realm index, so the bad shape only needs to live in
  the workspace at the moment the step or in-memory tool reads it.
- Both `instantiate-validation.spec.ts:91` and
  `run-instantiate-in-memory.spec.ts:71` call the overwrite helper
  after `client.pull` so the pull doesn't immediately stomp the
  workspace file.

Local verification: both previously flaking tests now pass
deterministically across runs. Two unrelated tests in the same
in-memory spec file (197, 295) still flake locally with realm-server
"database does not exist" errors during harness teardown — separate
infra issue, not from this change.

The realm-indexer behavior (dropping a Spec from search whenever its
linkedExamples target won't settle) is a real bug worth filing
separately; this PR is just unblocking the e2e on the fixture side.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@habdelra habdelra requested a review from Copilot May 13, 2026 00:42
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR addresses flakiness in instantiate-validation e2e tests by making fixture seeding deterministic and preventing realm-side indexing from dropping the Spec due to an error_doc linked example.

Changes:

  • Seed fixture files via a single client.sync(..., { waitForIndex: true }) to establish a deterministic “indexer settled” boundary.
  • Seed a well-formed realm-side TagsCard/bad-example.json, then overwrite the workspace copy post-pull with the intentionally broken shape.
  • Update the two affected e2e specs to apply the workspace overwrite after client.pull.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
packages/software-factory/tests/helpers/instantiate-test-fixtures.ts Adds sync-based seeding with waitForIndex, introduces valid placeholder example, and provides a workspace overwrite helper for the broken shape
packages/software-factory/tests/instantiate-validation.spec.ts Overwrites the workspace example after pull so validation reads the broken shape without poisoning realm indexing
packages/software-factory/tests/run-instantiate-in-memory.spec.ts Same workspace overwrite after pull in the relevant in-memory e2e cases

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

The helper writes each `files[].path` under a `mkdtempSync` staging
dir via `join(stagingDir, path)`. Callers in this file all pass
relative literal paths, so an escape isn't actually reachable today,
but the guard is near-zero-cost and prevents a future caller (or a
copy-paste outside this file) from accidentally pointing the seed at
a path outside the temp dir.

Reject `isAbsolute(path)` and any path segment of `..`. Both
conditions are checked before any disk touch so a bad input fails
loudly with a useful message rather than silently writing somewhere
unexpected on the test machine.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@habdelra habdelra requested a review from a team May 13, 2026 00:52
@habdelra habdelra changed the title Actually fix the instantiate-validation flake (follow-up to #4782) Actually fix the software factory shard 1/3 instantiate-validation flake (follow-up to #4782) May 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants