Skip to content

Realm-server: make _federated-search JSON:API-compliant via include#4667

Open
habdelra wants to merge 7 commits intomainfrom
cs-11037-jsonapi-federated-search
Open

Realm-server: make _federated-search JSON:API-compliant via include#4667
habdelra wants to merge 7 commits intomainfrom
cs-11037-jsonapi-federated-search

Conversation

@habdelra
Copy link
Copy Markdown
Contributor

@habdelra habdelra commented May 5, 2026

Why

Closes CS-11037.

Two problems with _federated-search today:

  1. Not JSON:API-compliant. Per spec, a server "MUST NOT include unrequested resources in a compound document." Today the endpoint side-loads relationships unconditionally — there is no way for a client to ask for data[] only.
  2. Forces wasted work on consumers that do not use the links. The cards-grid in-prerender fallback path is the dominant offender: per the ticket's staging data (2026-04-28 → 2026-05-05), 110 of 126 render-timeouts (87%) match this fingerprint and each timeout burns the full 90s render budget — ≈2.75 worker-hours per 7d wasted on this one query shape. The consumer (live-prerendered-search.ts) only reads instance.id and instance.constructor; it never traverses relationships, but the server still resolves N×M sequential link queries per search.

What changes

_federated-search accepts an include field in the request body alongside the existing query fields:

{
  "filter": { ... },
  "sort": [ ... ],
  "realms": [ ... ],
  "include": [],            // NEW: JSON:API include semantics
  "fields": { ... }         // already supported
}

Tri-state behavior at Realm.searchsearchCards:

include value Server behavior
absent (today's clients) unchanged: loadLinks: true. Preserves every existing caller.
[] data[] only. No included[]. loadLinks is skipped entirely.
["fieldA", "fieldB"] only the named relationships are side-loaded. Uses the existing linkFields seam in realm-index-query-engine.ts. Essentially free given the seam already exists.

Why include lives outside Query

assertQuery whitelists keys and rejects unknown ones. Adding include to Query would force updating assertQuery + normalizeQueryForSignature and conflate query semantics with response-shape semantics. Instead, a new parseSearchRequestFromPayload(payload): { query, include } extracts include first, then runs assertQuery on the rest — mirroring how parsePrerenderedSearchRequestFromPayload already extracts prerenderedHtmlFormat / cardUrls / renderType before delegating. include is a JSON:API document concern, not a query filter concern.

Wire path

Client → realm-server middleware (parses payload once, stashes in ctxt state) → handle-search.ts (extracts { query, include }) → searchRealms(realms, query, { include })Realm.search(query, { include })searchCards(query, searchOpts). The gating decision (whether to skip loadLinks) lives at the Realm.searchsearchCards boundary — the smallest, lowest-risk seam.

Host side: getSearch(... { include })SearchResource.modify (also folds include into the previous-args comparison so re-renders with a different include re-fetch correctly) → runtimeStore.search → both private fetch helpers conditionally serialize include into the body.

Opt-out sites in this PR

Two callers explicitly send include: []:

Caller File Why safe
Cards-grid in-prerender fallback prerendered-card-search.gts:213-227 (renderContextSearchResource) The consumer (live-prerendered-search.ts:104,172,210) only reads instance.id + instance.constructor. This is the primary perf win — 87% of staging render-timeouts trace here.
boxel-cli search command boxel-cli/src/commands/search.ts:48-58 The CLI only ever surfaces result.data to its callers. Spread order preserves any caller-set include if a future caller wants links.

By transitive consequence, every software-factory caller that goes through BoxelCliClient.search also gets include: [] for free:

Caller File Reads
scripts/boxel-search.ts:159 data only
issue-scheduler.ts:214,231,313 data / data.length
instantiate-execution.ts:598 spec discovery — data fields
test-run-execution.ts:107 test discovery — data only
factory-tool-builder.ts:354 tool builder query — data only

Audit of remaining _federated-search callers

Caller-by-caller audit confirms the only production callers of the endpoint are the host store (fetchAndHydrateSearchResults / fetchSearchData) and the boxel-cli search command. The realm-server's per-realm searchResponse HTTP handler (realm.ts:3670) is not federated and stays untouched per the ticket's "Out of scope".

Host call sites that go through StoreService.search / getSearchResource are explicitly deferred to a follow-up (each needs per-template auditing before opting out):

  • commands/search-cards.ts:81 — returns instances to a command consumer
  • commands/create-specs.ts:184 — existence check + opaque return
  • components/card-catalog/modal.gts:379 — catalog list
  • components/operator-mode/detail-panel.gts:457 — detail panel side info
  • resources/search.ts:405 (generic SearchResource) — many consumers via getCards / getSearchResource
  • resources/search-data.ts:169 — many consumers via getSearchData

Compatibility

include absent → unchanged behavior. Every existing client (host store on the un-migrated paths, third-party callers, internal tooling) continues to receive fully-hydrated responses. Flipping the JSON:API-spec default (no side-load when include is absent) is explicitly deferred to a follow-up after all callers migrate — keeps each PR small and reviewable.

Companion work: browser-cacheable linked-card fetches (CS-11010)

Once a caller opts out of side-loading via include: [] (or include: ["someField"] for a partial set), any other relationships it later needs have to be fetched individually as GET /<card> with Accept: application/vnd.card+json. Today every one of those requests is served with cache-control: no-store, no-cache, must-revalidate and a fresh full-body response, so a page that does N follow-up link fetches pays the server's full JSON-assembly cost on every page view — each repeat visit re-runs the same DB reads + serialization.

CS-11010 (in progress, sibling ticket) makes vnd.card+json ETag-aware on published realms — the server emits an ETag based on the realm's last_published_at and short-circuits to a 304 before the JSON-assembly work runs, mirroring what the host-shell HTML route already does (packages/realm-server/server.ts:444-462, landed in 2abf744a45). With both PRs in, the cards-grid prerender path becomes:

  1. _federated-search returns data[] only (no included resources, no link-loading work) — this PR's win.
  2. Any follow-up linked-card fetches that the renderer needs hit the browser cache or get a 304, not the realm's JSON-assembly path — CS-11010's win.

So the cost shifted off _federated-search here doesn't just reappear as N individual unconditional GETs; CS-11010 ensures those GETs short-circuit on a warm cache.

Out of scope

  • Migrating the remaining host call sites listed above. Each needs per-template auditing.
  • Flipping the JSON:API default to "no side-load when include is absent."
  • loadLinks performance improvements where the side-load is still needed — sibling tickets CS-11038 / CS-11039.
  • _federated-search-prerendered — already correct (no loadLinks call in searchPrerendered).

Test plan

  • Realm-server integration tests in tests/server-endpoints/search-test.ts. Fixtures extended with a Friend.linksTo(Friend) card (Alice → Bob) so the assertions are non-trivial. Cases:
    • include absent → response has included[] populated (today's behavior preserved)
    • include: [] → response has no included[]; relationship metadata still present in data for client-side traversal
    • include: ["friend"] → only the named relationship is side-loaded
    • include: "friend" (malformed string) → 400 invalid-query
  • Host integration tests in store-test.gts mounting a virtualNetwork handler to capture _federated-search request bodies:
    • runtimeStore.search(query, realms, { include: [] }) → captured body has include: []
    • runtimeStore.search(query, realms) (no opts) → captured body has no include key, so server-side default behavior is preserved
  • pnpm lint clean per-package on every package edited (runtime-common, realm-server, host, boxel-cli).

🤖 Generated with Claude Code

habdelra and others added 2 commits May 5, 2026 18:57
Accept `include` in the request body. When absent, behavior is unchanged
(loadLinks: true) so all current callers keep working. When `include: []`
is sent explicitly, the server returns `data[]` only and skips loadLinks
entirely; when `include: ["fieldA", ...]` is sent, only the named
relationships are side-loaded (uses the existing `linkFields` seam in
realm-index-query-engine).

Cards-grid in-prerender fallback (renderContextSearchResource in
prerendered-card-search.gts) and the boxel-cli `search` command now opt
out via `include: []`. The cards-grid path is the primary perf win — 87%
of staging render-timeouts trace to its N×M sequential link fan-out — and
boxel-cli (plus all software-factory callers that go through it) only
ever reads `result.data`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ebab785bae

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/host/app/services/store.ts
The host realm-server mock at tests/helpers/realm-server-mock/routes.ts
still parsed the entire `_federated-search` payload through
`parseSearchQueryFromPayload`, which goes through `assertQuery` and
rejects any unknown key. Without this fix, host integration tests that
send `{ include: [] }` would 400 with `unknown field in query: include`
instead of exercising the new behavior.

Switch to `parseSearchRequestFromPayload` (matching the real handler)
and forward `{ include }` into `searchRealms`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds JSON:API-style include semantics to the realm-server _federated-search endpoint (and host/CLI callers) to control relationship side-loading, enabling a high-impact performance opt-out for prerendered cards-grid searches while preserving existing default behavior.

Changes:

  • Introduces parsing/propagation of a top-level include array in _federated-search request bodies, with tri-state behavior (absent vs [] vs named relationships).
  • Updates Realm.search() and searchRealms() plumbing to translate include into link-loading options (loadLinks / linkFields).
  • Updates host + CLI call sites and adds integration tests covering default behavior, opt-out, selective includes, and malformed input.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
packages/runtime-common/search-utils.ts Adds SearchRequest parsing (include) and threads include options through searchRealms().
packages/runtime-common/realm.ts Implements tri-state include behavior by mapping to loadLinks/linkFields for searchCards().
packages/realm-server/handlers/handle-search.ts Switches endpoint parsing to parseSearchRequest* and forwards include into realm searches.
packages/realm-server/tests/server-endpoints/search-test.ts Adds fixtures + tests for default side-loading, include: [], selective include, and malformed include.
packages/host/app/services/store.ts Forwards optional include into _federated-search request body (omits key when unspecified).
packages/host/app/resources/search.ts Exposes include as a resource arg and includes it in change-detection + search requests.
packages/host/app/components/prerendered-card-search.gts Opts prerendered cards-grid searches into include: [] to skip link side-loading.
packages/host/tests/integration/store-test.gts Tests that host requests include include: [] when specified and omit the key otherwise.
packages/boxel-cli/src/commands/search.ts Defaults CLI search requests to include: [] to avoid side-loading overhead.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread packages/runtime-common/search-utils.ts
`parseSearchQueryFromPayload` is now only used internally by
`parseSearchRequestFromPayload` (post-include-strip), and
`parseSearchQueryFromRequest` has no remaining callers across the
monorepo. The exports were a forward-misuse footgun: any external caller
that passed a full request payload (now containing `include`) to
`parseSearchQueryFromPayload` would get `unknown field in query: include`
because `assertQuery` whitelists keys.

Drop the dead `parseSearchQueryFromRequest` entirely and make
`parseSearchQueryFromPayload` module-private.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 5, 2026

Preview deployments

Host Test Results

    1 files  ±0      1 suites  ±0   1h 57m 36s ⏱️ - 4m 3s
2 567 tests ±0  2 552 ✅ ±0  15 💤 ±0  0 ❌ ±0 
2 586 runs  ±0  2 571 ✅ ±0  15 💤 ±0  0 ❌ ±0 

Results for commit 62aaaeb. ± Comparison against earlier commit b940f4d.

Realm Server Test Results

    1 files  ±    0      1 suites  ±0   16m 27s ⏱️ +11s
1 260 tests ±    0  1 260 ✅ ±    0  0 💤 ±0  0 ❌ ±0 
2 446 runs  +1 108  2 446 ✅ +1 108  0 💤 ±0  0 ❌ ±0 

Results for commit 62aaaeb. ± Comparison against earlier commit b940f4d.

@habdelra
Copy link
Copy Markdown
Contributor Author

habdelra commented May 6, 2026

packages/boxel-cli/src/commands/search.ts
Outdated
Comment on lines +54 to +57
let body: Record<string, unknown> = {
realms,
include: [],
...query,
@chatgpt-codex-connector
chatgpt-codex-connector Bot
4 hours ago
P2 Badge Preserve relationship data for CLI search results

When the CLI searches cards that have linked relationships, defaulting every request to include: [] changes the returned data[] shape even though SearchResult only exposes data: the server's link-loading path does not just populate top-level included, it also mutates each root resource's relationships to add relationship.data identifiers after resolving links.self. With this default, CLI/API clients that read linked card ids from result.data[*].relationships now get only relationship links unless they know to opt back in by passing an include key inside the query.

@habdelra
Copy link
Copy Markdown
Contributor Author

habdelra commented May 6, 2026

Thanks for forwarding this — checked it against the actual SF/CLI consumers and it's a defensive guard for non-reachable state, not a real reachable bug.

Codex's technical claim is accurate: loadLinks does mutate each root resource's relationship.data (realm-index-query-engine.ts:1072) to add { type, id } after resolving links.self. With include: [] that mutation is skipped.

But every current consumer of CLI search results that reads card.relationships reads links.self, not data.id:

  • software-factory/src/issue-scheduler.ts:401extractLinksToManyIds: let linkUrl = rel?.links?.self
  • software-factory/src/instantiate-execution.ts:695extractLinkedExamples: if (!entry?.links?.self) break
  • software-factory/src/parse-execution.ts:856 — same extractLinkedExamples helper

relationship.links.self is part of the indexed pristine_doc and survives regardless of loadLinks (asserted in the new realm-server test: relationship link is present in data even without side-load).

A future caller that does result.data[i].relationships.foo.data.id would lose data, but the spread order in search.ts already lets that caller pass their own include in the query to opt back in:

let body: Record<string, unknown> = {
  realms,
  include: [],
  ...query,    // ← caller-set include wins
};

So the contract is: by default the CLI does the cheap thing; callers who need link resolution can opt in. Worth a JSDoc comment on the search() function noting the trade-off — happy to add that if you'd like — but I don't think the default itself needs to change.

@habdelra
Copy link
Copy Markdown
Contributor Author

habdelra commented May 6, 2026

Heads-up from CI investigation on the stack (#4668#4670): the loadLinks BFS rewrite in #4668 regresses two host integration tests that pass on main:

Test File Notes
Integration | realm indexing: can index a field with a cycle in the linksTo field packages/host/tests/integration/realm-indexing-test.gts:3388 Hassan ↔ Mango cycle, both in same realm
Integration | realm indexing: can index a field with a cycle in the linksToMany field :3859 Same shape, linksToMany
Acceptance | code submode tests > single realm: clicking a linksToMany field in playground panel opens the linked card JSON code-submode-tests UI-side deepEqual of a loadLinks result

All three fail with Actual: [object Object] Expected: [object Object] (no message), which means a deepEqual is comparing two near-identical structures whose diff isn't being printed by qunit. Failure shape is consistent — the BFS produces a subtly-different included[] or relationship.data from the recursive original in cycle-back-to-root cases.

Confirmed pre-existing: same exact tests fail on cs-11038's 6ddece7e0b (Host Tests 16/20 + 20/20). They pass on main (47e688436a). #4670 inherits the failure since it's stacked on #4668; my change only touches the cross-realm branch and can't reach these in-realm cycle paths.

This blocks #4668 → main, which in turn blocks #4670. The Codex maxLinkDepth boundary fix (e218d21421) didn't catch this case. Likely candidates for the BFS bug:

  • visited is added at root-resource enqueue and again at layer expansion; the else if (omitSet.has(linkResource.id) || included.find(...)) branch only sets foundLinks — it doesn't push to included[] when the cycle-target is the root (which is in omitSet, not included). The recursive original may have produced a different rewrite for that case.
  • The cloneDeep of linkResource happens before any relationship.data rewrite; nested relationships discovered later don't propagate into the cloned included[] entry.

Posting here per Hassan's request so the context isn't lost when #4667 lands first. CC @habdelra.

@habdelra
Copy link
Copy Markdown
Contributor Author

habdelra commented May 6, 2026

Got it — those two realm-indexing-test.gts tests + the code-submode deepEqual are all in #4668's BFS rewrite path, not in anything #4667 touches.

Confirmed scope-isolation for #4667:

  • Realm.search plumbing only added an opts.include arm; the existing loadLinks: true call path is byte-identical when include is absent.
  • The linkFields argument forwarded for partial-include reuses the existing seam (realm-index-query-engine.ts:914) — no traversal-order changes.
  • None of the failing tests exercise _federated-search; they're pure indexing tests calling searchCards directly with the recursive loadLinks.

So the cycle-back-to-root regression you're tracking lives entirely in #4668's BFS, and #4667 landing first doesn't affect that investigation. I'll watch the monitor for any in-PR feedback on this code path here.

Single-run measurement on b940f4d showed Acceptance | Commands tests at
66.8 MB (+50.4 MB over baseline, 0.4 MB over the +50MB hard threshold).
Sibling PRs on the same baseline measured ~50.5 MB on this same module
on their first run. Triggering a fresh measurement to confirm whether
this PR consistently regresses or whether the original was a single-run
variance hit on a noisy module.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@habdelra
Copy link
Copy Markdown
Contributor Author

habdelra commented May 6, 2026

Pause on merging this until we have a deeper discussion around how to handle links for searches during indexing time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants