Phase 2: job-scoped same-realm search cache during indexing #4791
Open: habdelra wants to merge 10 commits into `main` from `cs-11115-phase2-job-cache`
Commits (10)
- 46fb78b realm-server: job-scoped same-realm search cache during indexing
- 0315d32 Merge branch 'cs-11115-phase1-in-flight-dedup' into cs-11115-phase2-j…
- 1501a8c job-scoped-search-cache tests: address lint feedback
- 3e4794b plumb x-boxel-job-id from worker → page → host fed-search
- 960ab26 render-runner: prettier formatting fix in __boxelJobId injection
- 04b7774 Phase 2 review fixes: CORS preflight cache + per-visit CDP RTT
- 0f905e0 JobScopedSearchCache: bounded eviction + docstring corrections
- c543f68 Merge remote-tracking branch 'origin/main' into cs-11115-phase2-job-c…
- 82fb488 prerenderer: forward jobId on browser-restart retry path
- c6f1e2c job-scoped-search-cache test: fix FIFO-cap expectation
```typescript
import {
  normalizeQueryForSignature,
  sortKeysDeep,
  type LinkableCollectionDocument,
  type Query,
} from '@cardstack/runtime-common';

// Default entry TTL. Picked to comfortably outlive a single indexing
// batch (workers cap from-scratch jobs at 6 min, incremental jobs are
// shorter) while bounding the worst case where a job ends without a
// NOTIFY-driven eviction reaching this process — a leaked entry persists
// at most this long. Cross-job collision is impossible because the cache
// key includes `jobId`, so a stale leak only hurts memory, never
// correctness.
const DEFAULT_TTL_MS = 10 * 60 * 1000;

// Hard cap on total entries across all jobs. When the cap is reached,
// the FIFO-oldest entry is evicted to make room. The cap exists to bound
// worst-case memory: the `jobId` header is sanitized to a digits-only
// shape, but the cache otherwise accepts any well-formed
// `(jobId, query, opts)` tuple from an authenticated caller, so a
// reader who mints synthetic jobIds and varied queries could otherwise
// grow the cache without bound for the full TTL window. Picked to
// comfortably accommodate the busiest realistic workload (a from-
// scratch reindex of a piranha-class realm fires hundreds of distinct
// queries within one job) while keeping the worst-case footprint bounded
// to ~tens of MB.
const DEFAULT_MAX_ENTRIES = 5000;

type CachedEntry = {
  result: LinkableCollectionDocument;
  timer: ReturnType<typeof setTimeout>;
  // Position in the FIFO eviction ring. Stored on the entry so a
  // cache hit doesn't need a separate map lookup to know its slot.
  fifoSeq: number;
};
```
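The cap comment above relies on JavaScript `Map` iterating keys in insertion order, so the first key is always the FIFO-oldest. A minimal standalone sketch of that eviction idea (a hypothetical `BoundedFifoMap`, not part of this PR):

```typescript
// Hypothetical sketch: a Map with a hard entry cap that evicts the
// insertion-oldest key, using the guarantee that JavaScript Maps
// iterate in insertion order.
class BoundedFifoMap<K, V> {
  #map = new Map<K, V>();

  constructor(private maxEntries: number) {}

  set(key: K, value: V): void {
    // Re-inserting an existing key moves it to the back of the FIFO,
    // mirroring the last-write-wins overwrite in the cache above.
    this.#map.delete(key);
    this.#map.set(key, value);
    while (this.#map.size > this.maxEntries) {
      // The first key in iteration order is the oldest insertion.
      let oldest = this.#map.keys().next().value as K;
      this.#map.delete(oldest);
    }
  }

  get(key: K): V | undefined {
    return this.#map.get(key);
  }

  get size(): number {
    return this.#map.size;
  }
}
```

The real class additionally has to keep the FIFO ring consistent with per-entry TTL timers, which is why it stores a `fifoSeq` on each entry rather than relying on key order alone.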
```typescript
// Same-realm read cache used during indexing. Each entry is keyed by
// `(jobId, normalizedQuery, normalizedOpts)` and represents one
// `_federated-search` populate computed during the lifetime of one
// indexing job. Safe because within an indexing batch the writer
// touches `boxel_index_working`, not `boxel_index` — so every read of
// the same realm's `boxel_index` returns identical bytes until the
// batch's `applyBatchUpdates` swap fires. The job-id boundary scopes
// the cache to a single batch; a subsequent job hashes to different
// keys and never reuses a stale value.
//
// The handler gates entry into this cache on three conditions all
// holding: `x-boxel-job-id` present, `x-boxel-consuming-realm` present,
// and the request's `realms` array is exactly `[consumingRealm]`.
// Cross-realm reads bypass the cache because peer realms can swap
// independently — a cached read against a foreign realm could freeze
// a stale snapshot. Anonymous (no jobId) reads also bypass: those
// callers are not inside the batch's snapshot-stable read window and
// must always see live state.
//
// Entries store the *resolved* doc, not the in-flight promise.
// Concurrent same-key callers each run their own `populate` (Phase 1's
// in-flight dedup at `RealmIndexQueryEngine.searchCards` already
// coalesces the heavy inner SQL+loadLinks walk for same-realm calls
// arriving concurrently). The first to finish stores its result here;
// later sequential callers within the same job see the cached doc and
// short-circuit before re-entering `searchRealms`.
//
// Storing promises was tempting (it would also dedupe at this layer)
// but creates a tail-latency stall: a slow first populate blocks every
// later same-key caller past their render-timeout window, even when
// they could otherwise have run their own search in parallel and made
// progress. Resolved-only avoids that failure mode and keeps the
// benefit of sequential dedup, which is the win this cache exists for.
export class JobScopedSearchCache {
  #byJob = new Map<string, Map<string, CachedEntry>>();
  // FIFO ring keyed by an ever-incrementing sequence so eviction
  // ordering is independent of the (jobId, innerKey) name space. The
  // oldest surviving entry is the first key in iteration order;
  // entries leave the ring as they are evicted (either via the cap or
  // their own TTL).
  #fifo = new Map<number, { jobId: string; innerKey: string }>();
  #nextFifoSeq = 0;
  readonly #ttlMs: number;
  readonly #maxEntries: number;

  constructor(opts?: { ttlMs?: number; maxEntries?: number }) {
    this.#ttlMs = opts?.ttlMs ?? DEFAULT_TTL_MS;
    this.#maxEntries = opts?.maxEntries ?? DEFAULT_MAX_ENTRIES;
  }

  async getOrPopulate(args: {
    jobId: string;
    query: Query;
    opts: unknown | undefined;
    populate: () => Promise<LinkableCollectionDocument>;
  }): Promise<LinkableCollectionDocument> {
    let innerKey = buildInnerKey(args.query, args.opts);
    let jobMap = this.#byJob.get(args.jobId);
    let existing = jobMap?.get(innerKey);
    if (existing) {
      return existing.result;
    }

    let result = await args.populate();

    // Late-arriving check: this populate may have just settled while a
    // peer's populate (same key) also settled and stored its result
    // first. Last-write-wins; either of the two resolved docs is
    // equally valid since they came from the same `(jobId, query)`
    // tuple against the same snapshot-stable boxel_index.
    let currentJobMap = this.#byJob.get(args.jobId);
    if (!currentJobMap) {
      currentJobMap = new Map();
      this.#byJob.set(args.jobId, currentJobMap);
    }
    let prior = currentJobMap.get(innerKey);
    if (prior) {
      clearTimeout(prior.timer);
      this.#fifo.delete(prior.fifoSeq);
    }
    let fifoSeq = this.#nextFifoSeq++;
    let timer = setTimeout(() => {
      this.#evictByKey(args.jobId, innerKey, timer);
    }, this.#ttlMs);
    if (typeof (timer as { unref?: () => void }).unref === 'function') {
      (timer as { unref: () => void }).unref();
    }
    currentJobMap.set(innerKey, { result, timer, fifoSeq });
    this.#fifo.set(fifoSeq, { jobId: args.jobId, innerKey });

    // Cap enforcement: evict FIFO-oldest until under the limit. Map
    // preserves insertion order, so the first key is the oldest. We
    // skip over any keys whose entries are already gone (TTL fired)
    // without rewriting the ring.
    while (this.#fifo.size > this.#maxEntries) {
      let oldestSeq = this.#fifo.keys().next().value;
      if (oldestSeq === undefined) break;
      let slot = this.#fifo.get(oldestSeq)!;
      this.#fifo.delete(oldestSeq);
      let jm = this.#byJob.get(slot.jobId);
      let entry = jm?.get(slot.innerKey);
      if (entry?.fifoSeq === oldestSeq) {
        clearTimeout(entry.timer);
        jm!.delete(slot.innerKey);
        if (jm!.size === 0) {
          this.#byJob.delete(slot.jobId);
        }
      }
    }

    return result;
  }

  #evictByKey(
    jobId: string,
    innerKey: string,
    expectedTimer: ReturnType<typeof setTimeout>,
  ): void {
    let jm = this.#byJob.get(jobId);
    if (!jm) return;
    let entry = jm.get(innerKey);
    if (entry?.timer === expectedTimer) {
      this.#fifo.delete(entry.fifoSeq);
      jm.delete(innerKey);
      if (jm.size === 0) {
        this.#byJob.delete(jobId);
      }
    }
  }

  // Drop every entry for a given job. Wired in by the NOTIFY-driven
  // eviction path so the cache releases memory as soon as the worker
  // signals job completion, rather than waiting on TTL.
  clearJob(jobId: string): void {
    let jobMap = this.#byJob.get(jobId);
    if (!jobMap) return;
    for (let entry of jobMap.values()) {
      clearTimeout(entry.timer);
      this.#fifo.delete(entry.fifoSeq);
    }
    this.#byJob.delete(jobId);
  }

  // Total entry count across all jobs. Useful for tests + observability.
  size(): number {
    let total = 0;
    for (let jm of this.#byJob.values()) {
      total += jm.size;
    }
    return total;
  }

  jobIds(): string[] {
    return [...this.#byJob.keys()];
  }
}
```
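The resolved-only tradeoff described in the class docstring can be illustrated with a stripped-down stand-in (the names here are illustrative assumptions, not the PR's API): concurrent same-key callers each run `populate` because no promise is stored, while a later sequential caller short-circuits on the stored doc.

```typescript
// Hypothetical stand-in for the resolved-only caching policy: store
// only settled results, never in-flight promises.
class ResolvedOnlyCache<T> {
  #entries = new Map<string, T>();

  async getOrPopulate(key: string, populate: () => Promise<T>): Promise<T> {
    let existing = this.#entries.get(key);
    if (existing !== undefined) return existing;
    let result = await populate();
    // Last-write-wins: a peer that settled first may already have
    // stored its result; overwriting is harmless for identical keys.
    this.#entries.set(key, result);
    return result;
  }
}

async function demo(): Promise<number> {
  let cache = new ResolvedOnlyCache<string>();
  let populateRuns = 0;
  let populate = async () => {
    populateRuns++;
    return 'doc';
  };
  // Concurrent callers: both run populate, since neither has stored
  // a resolved result yet (no promise is cached).
  await Promise.all([
    cache.getOrPopulate('job-1|query', populate),
    cache.getOrPopulate('job-1|query', populate),
  ]);
  // Sequential caller: hits the resolved entry, populate is skipped.
  await cache.getOrPopulate('job-1|query', populate);
  return populateRuns;
}
```

A promise-storing variant would have run `populate` once here, but at the cost the docstring names: every later same-key caller would be chained behind the first caller's latency.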
```typescript
// Compose the per-job inner key. Excludes jobId since the outer Map is
// already partitioned by jobId — this keeps inner-key length bounded
// regardless of how the call site formats the jobId. Excludes the
// realms array (the cache gate already enforces same-realm-only), so
// two requests with `realms: [R]` produce the same inner key
// regardless of array identity.
function buildInnerKey(query: Query, opts: unknown | undefined): string {
  return JSON.stringify([
    normalizeQueryForSignature(query),
    opts ? sortKeysDeep(opts) : null,
  ]);
}
```
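`normalizeQueryForSignature` and `sortKeysDeep` come from `@cardstack/runtime-common` and are not shown in this diff. As a rough sketch of the normalization they are used for, a deep key sort makes structurally equal inputs serialize to the same string regardless of key insertion order (hypothetical re-implementation, assuming plain-JSON query objects):

```typescript
// Hypothetical sketch of deep key normalization: recursively sort
// object keys so structurally equal values stringify identically.
function sortKeysDeepSketch(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(sortKeysDeepSketch);
  if (value && typeof value === 'object') {
    let obj = value as Record<string, unknown>;
    return Object.fromEntries(
      Object.keys(obj)
        .sort()
        .map((k) => [k, sortKeysDeepSketch(obj[k])]),
    );
  }
  return value;
}

// Sketch of the inner-key shape: a JSON tuple of the normalized query
// and normalized opts (or null when opts is absent).
function buildInnerKeySketch(query: unknown, opts?: unknown): string {
  return JSON.stringify([
    sortKeysDeepSketch(query),
    opts ? sortKeysDeepSketch(opts) : null,
  ]);
}
```

This is why two callers that build the same query with different key order (or a fresh `realms: [R]` array each time) still land on one cache entry.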