Realm-server: parallelize cross-realm fetches in loadLinks (CS-11039)#4670
Realm-server: parallelize cross-realm fetches in loadLinks (CS-11039)#4670
Conversation
860402e to
c6d4f10
Compare
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR parallelizes cross-realm relationship fetches during loadLinks BFS traversal so cross-realm link resolution happens concurrently with the batched in-realm DB queries, reducing per-layer network round-trips.
Changes:
- Added
fetchCrossRealmLinks()helper to fan out cross-realm card fetches and validate responses. - Collected cross-realm link URLs per BFS layer and fetched them in Step 3 via
Promise.all. - Replaced per-entry cross-realm fetching in Step 4 with a synchronous lookup from a prefetched map.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The batching BFS already collapses in-realm relationship resolution to one batched DB query per recursion depth and runs the in-realm instance + file-meta queries concurrently. Cross-realm links were still serialized — Step 4 awaited each `this.#fetch(entry.linkURL)` per entry. Collect cross-realm URLs alongside the in-realm Sets in Step 2, dedupe, and fold a `Promise.all`-fanout over those URLs into the existing Step 3 `Promise.all` so they run concurrently with the batched DB queries. Step 4 then resolves cross-realm linkResources from a prefetched map rather than awaiting per-entry fetches. Wall-clock cross-realm resolution at one BFS depth collapses from N sequential fetches to one event-loop tick. Cycle / depth / omit / data-rewrite semantics are preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`globalThis.fetch` accepts string URLs directly; the helper was constructing a URL only to immediately consume one input that was already a canonical `URL.href`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7ab00c3 to
aa222c7
Compare
Host Test Results 1 files + 1 1 suites +1 2h 2m 6s ⏱️ + 2h 2m 6s Results for commit 85ad876. ± Comparison against earlier commit 5864c8d. Realm Server Test Results 1 files ± 0 1 suites +1 17m 4s ⏱️ + 17m 4s Results for commit 85ad876. ± Comparison against earlier commit 5864c8d. |
…cs-11039-parallelize-loadlinks
…cs-11039-parallelize-loadlinks
PR #4668 has been seeing intermittent host-test failures with generic "A network error occurred" or "Failed to fetch dynamically imported module" messages — both are runner-level / teardown-time flake signatures, and the same tests pass on PR #4670 (which includes all the BFS code from #4668), so they look like infrastructure flakes rather than code regressions. Adding diagnostics anyway so the next CI run captures concrete data: - An invocation id at loadLinks entry, repeated on every layer log line — lets us correlate logs from one loadLinks call across layers when investigating CI failures. - Layer summaries (size, entry count, in-realm/cross-realm breakdown). - Batch index lookup result counts + ms timing. - try/catch around `Promise.all(populateQueryFields)` and the batched index lookup that logs the failing context before re-throwing. - try/catch around the cross-realm fetch logging the URL and HTTP status (or thrown error message) before re-throwing. - Completion log with layers visited / included size. All under `this.#log.debug` (or `.warn` on the catch arms), so it follows the existing logger semantics and stays out of production noise unless the channel is enabled. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…loadlinks # Conflicts: # packages/runtime-common/realm-index-query-engine.ts
Summary
Closes CS-11039.
Builds on the BFS rewrite from #4668 (CS-11038) — that work is included in this PR's diff.
RealmIndexQueryEngine.loadLinkswas the dominant cost behind_federated-search90s timeouts on staging. PR #4668 collapsed in-realm relationship resolution to one batched DB query per BFS depth and ran the in-realm instance + file-meta queries concurrently. Cross-realm fetches were still serialized — Step 4 of the BFS awaitedthis.#fetch(entry.linkURL)per entry, so a layer with N cross-realm links did N sequential network round-trips on the realm-server's event loop while every other request waited.This PR folds cross-realm fetches into Step 3's existing
Promise.allso they run concurrently with the batched in-realm DB queries. Step 4 then resolves cross-realmlinkResources from a prefetched map. Wall-clock cross-realm resolution at one BFS depth collapses from N sequential fetches to one event-loop tick.Why this matters (staging data anchor, 2026-04-28 → 2026-05-05)
The 30+ concurrent
_federated-searchflood at UTC 13:24:25 produced response times spanning <500ms to >90s. The long-tail responses are the ones that issued N×M sequential awaits while the event loop serviced other requests. With #4668 alone, in-realm link traversal collapses; with this PR, cross-realm link traversal also collapses. Together they're multiplicative:Promise.all)For a search whose results reference K unique cross-realm cards at one BFS depth, this PR turns K sequential
awaits into one event-loop tick, bounded by HTTP keep-alive on the cross-realm fetch path. The PG connection pool and HTTP keep-alive pool are the natural backpressure surfaces — no explicit semaphore needed.What changes
All in
packages/runtime-common/realm-index-query-engine.ts:Step 2 (
realm-index-query-engine.ts:1004,:1096) — while walking each entry's relationships, in addition to growinginRealmCardURLs/inRealmFileURLs, also grow a dedupedcrossRealmURLsSet. Dedup is strictly better than the prior code, which would re-fetch the same cross-realm URL once per entry that referenced it (sequentially!).Step 3 (
realm-index-query-engine.ts:1117-1137) — the existingPromise.allover instance + file-meta DB queries gains a third arm:fetchCrossRealmLinks(urls)(realm-index-query-engine.ts:878) — new private helper. Issuesthis.#fetchper unique URL insidePromise.all, validates withisSingleCardDocument, surfaces failures viaCardError.fromFetchResponse(same error path as before), and returnsMap<urlHref, CardResource<Saved>>.Step 4 (
realm-index-query-engine.ts:1160) — the cross-realm branch becomes a synchronous map lookup:linkResource = crossRealmMap.get(entry.linkURL.href). The rest of Step 4 (visited, descendStack, included push, relationship.data rewrite) is unchanged.Relationship to sister tickets
includeopt-out so callers can skiploadLinksentirely. Not in this stack.Promise.allof cross-realm fetches would still run alongside N×M sequential in-realm DB round-trips.Acceptance criteria (from ticket)
loadLinksouter caller issues per-resource link resolution in parallel — already true after Realm-server: batch link resolution in loadLinks (CS-11038) #4668's BFS; the outerfor...awaitoverdoc.datawas removed whenloadLinksswitched to takingrootResources: doc.data.loadLinksissue in parallel within a single layer — this PR.visited; cycles still detected —visitedsemantics unchanged.CardError.fromFetchResponsestill thrown;Promise.allsurfaces the first failure exactly as the original sequential code did.pnpm lintclean onruntime-common;loadLinks: trueintegration tests inindexing-test.ts:2829, 2973exercise cross-realm;load-links-batching-test.tsexercises in-realm batching.Out of scope
loadLinks: truedefault.Test plan
pnpm lintclean onruntime-common.🤖 Generated with Claude Code