fix(storage): ServerCredentialStore put-race + vault orphan on rollback failure (sec)#540
Merged
Merged
Conversation
Coverage Report
File CoverageNo changed files found. |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR addresses correctness/safety issues in ServerCredentialStore by preventing visibility of mid-flight new credential writes and by improving rollback/orphan detection paths when vault writes fail. It also tightens deleteAll() scoping behavior and adds tests for the new semantics.
Changes:
- Add
pending/orphanedAtmetadata fields and treatpendingrows as absent in readers (get,has,listMetadata,keys). - Rework
put()for new entries to be metadata-first with apendingflag, plus an orphan-marker fallback when rollback metadata deletion fails. - Tighten
deleteAll()to filter by key prefix first and add new tests for concurrency, pending invisibility, orphan marker behavior, and scope isolation.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| packages/storage/src/credentials/ServerCredentialStore.ts | Introduces pending/orphan metadata, reorders new-entry put() to metadata-first, adjusts readers and scope filtering. |
| packages/storage/src/credentials/ServerCredentialStore.test.ts | Adds tests for pending-row invisibility, rollback/orphan marker behavior, concurrency, and deleteAll scope isolation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+140
to
+141
| // Commit: clear the pending flag. | ||
| await this.metadata.put(id, { ...baseRow, pending: false }); |
Comment on lines
+145
to
+150
| // Update is not atomic; concurrent get() may observe either old or new | ||
| // value but never a torn read. We overwrite the vault first; if the | ||
| // metadata update then fails we rethrow without rolling the vault back | ||
| // because the prior metadata row still points at the same vault id. | ||
| await this.vault.setSecret(id, value); | ||
| await this.metadata.put(id, baseRow); |
Comment on lines
+188
to
+192
| // Defensive: the rollback path went through vault.deleteSecret OR the | ||
| // vault was never written (depending on which write failed first). | ||
| // The current implementation writes metadata first, so vault is never | ||
| // touched if metadata.put throws — accept either outcome. | ||
| expect(deleteSpy.mock.calls.length >= 0).toBe(true); |
Comment on lines
+154
to
+156
| // committed value or undefined — never the not-yet-committed "v2". | ||
| const observed = await store.get("k"); | ||
| expect(observed === "v1" || observed === undefined).toBe(true); |
…ate-path comment Addresses Copilot review on PR #540: - New-entry put(): if the commit-step `metadata.put({pending:false})` throws after a successful vault.setSecret, the row stayed `pending:true` (invisible to readers) while the vault held bytes — same orphan failure mode as the vault-rollback-fails path, with no operator signal. Wrap the commit write, best-effort persist an orphan marker, and throw a wrapped error. - Update path comment overclaimed "never a torn read". The actual window is a stale-metadata read: a concurrent get() between vault.setSecret and metadata.put returns the new vault value alongside the OLD metadata (updatedAt, expiresAt, label, provider). Vault reads themselves are atomic. Comment rewritten to describe this honestly and to point at versioned vault ids as the clean fix.
…t-step orphan path Addresses Copilot review on PR #540: - Rename and tighten the in-flight concurrency test to assert the prior committed value EXACTLY (the second put() is an update, so metadata stays non-pending and get() must return "v1"). The old wording allowed undefined, which would mask a regression that treats committed rows as absent. - Replace the trivially-true `deleteSpy.mock.calls.length >= 0` with explicit `setSpy.not.toHaveBeenCalled()` / `deleteSpy.not.toHaveBeenCalled()` to pin the metadata-first behavior (vault is never touched when the pending metadata write throws). - Add a regression test for the commit-step orphan path: first metadata.put (pending:true) succeeds, vault.setSecret succeeds, second metadata.put (commit pending:false) throws. Assert the wrapped "commit failed" error, that the vault retains the bytes, and that the metadata row is a sticky orphan marker.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes two correctness/safety bugs in
ServerCredentialStore(packages/storage/src/credentials/ServerCredentialStore.ts).Bug 1 — put-during-get race for new entries
The previous
put()wrote the vault first, then the metadata row. A concurrentget()for a brand-new key could observemetadata.get(id) === undefinedand returnundefined, whilehas()would shortly afterwards report the same key as present — but more importantly, for an UPDATE concurrent with aget(), the metadata row pointed at the OLD logical value while the vault had ALREADY been overwritten with the NEW value, soget()could return the new ciphertext under the oldupdatedAt— a torn read. For a brand-new entry, the inverse hole existed: the vault had bytes but no metadata, so the value was unreachable and orphaned if the metadata write then failed.Bug 2 — silent vault orphan on rollback failure
The previous rollback path used
await this.vault.deleteSecret(id).catch(() => undefined), silently swallowing rollback errors. The vault could be left holding plaintext for an id with no metadata pointing at it, with no signal to operators.Chosen scheme
put()for new entries, with apending: trueflag on the metadata row:metadata.put(id, { ...row, pending: true })await vault.setSecret(id, value)metadata.put(id, { ...row, pending: false }).get,has,listMetadata,keys) treatpending === trueas absent, so the in-flight row is invisible until committed.vault.setSecretthrows, attemptmetadata.delete(id); if THAT also throws, persist{ pending: true, orphanedAt: new Date().toISOString() }so the row remains invisible to readers but is discoverable by an operator/reconciler, and throw a wrappedError("ServerCredentialStore.put: vault write failed and rollback may have orphaned vault id <id>", { cause: vaultError }).get()may observe either the old or new value, but never a torn read of vault+metadata (get()reads metadata first, then the vault id — both writes are for the same id).deleteAll()scope-tightened. The internalscopedIds()helper now filters viaentry.key.startsWith(this.prefix)first (cheap string check) and re-assertsuserId/projectIdfrom the row as defence-in-depth. Pending/orphaned rows in the current scope are picked up bydeleteAll().Compatibility
No migration needed —
pendingandorphanedAtare optional fields, absent on existing rows; readers treat absent/falseas committed.Deferred follow-ups
get()resolves a singleidand reads the value for that id), but a strict snapshot would require write-new-id-then-flip-metadata semantics. Out of scope here.IKvStoragesodeleteAll()is O(scope) instead of O(all rows) across all KV implementations. TodayscopedIds()callsgetAll()and filters in memory; that's fine for the in-memory and small-row backends but suboptimal for SQL-backed KVs.Test plan
bun test packages/storage/src/credentials/ServerCredentialStore.test.ts— existing 10 cases plus 5 new ones (concurrent put, metadata-failure rollback, vault+rollback failure with orphan marker, pending-row invisibility, deleteAll scope isolation).packages/storage/src/credentials/was touched (constraint).IKvStorageorSecretVault.tschange (constraint).Generated by Claude Code