fix(d1-votes): prune orphan vote rows and fail verify on surplus keys#571
fix(d1-votes): prune orphan vote rows and fail verify on surplus keys#571jony376 wants to merge 3 commits into
Conversation
Co-authored-by: Cursor <cursoragent@cursor.com>
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis PR introduces shared vote catalog utilities and refactors sync and verify scripts to detect and prune orphaned vote entries. A new ChangesVote Sync and Verify Orphan Pruning
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 8✅ Passed checks (8 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches✨ Simplify code
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
7464061 to
82e5e06
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
scripts/verify-d1-votes.mjs (2)
21-33:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftVerify
votes_by_clienttoo, not justvotes_entries.This check only inspects
votes_entries. If sync leaves orphaned keys invotes_by_client, verify still reports green even though the D1 vote state is out of sync with the catalog. Given the stated invariant for both tables, verification should compare orphan/missing keys across both datasets.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/verify-d1-votes.mjs` around lines 21 - 33, The current getRows function only queries votes_entries, so update it to fetch both tables and return both result sets for verification: modify the args assembled in getRows (referencing getRows, d1Binding, runMode, and args) to either run two SQL commands (e.g., two SELECTs: one for entry_key, upvote_count FROM votes_entries and one for the relevant columns from votes_by_client such as client_id, entry_key) or a single script that returns both result sets, then adjust the caller to consume both results so the verifier can compare orphan/missing keys across votes_entries and votes_by_client. Ensure the remote/local flag logic using runMode and d1Binding remains unchanged.
48-50:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFail on surplus rows, not only shortages.
actualdeduplicates repeatedentry_keys, so a duplicate row for an expected key won't show up inorphans. With the currenttotalRows < totalExpectedcheck, that case passes verification even though D1 still has extra rows. This should fail on any row-count mismatch, not just missing rows.Minimal fix
if ( result.missing.length > 0 || result.negativeCounts.length > 0 || - result.totalRows < result.totalExpected || + result.totalRows !== result.totalExpected || result.orphans.length > 0 ) { failed = true; }Also applies to: 83-87
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/verify-d1-votes.mjs` around lines 48 - 50, The verification currently only fails when totalRows < totalExpected (allowing surplus duplicate rows to slip through); change the check to fail on any row-count mismatch by replacing the strict "less than" check with a strict inequality (totalRows !== totalExpected) using the raw rows.length (variable totalRows) and update the associated error/exit path accordingly; apply the same change to the similar block that appears at the later lines (the other totalRows/totalExpected comparison) so any surplus or shortage causes verification to fail even though the deduplicated Map actual may hide duplicates.
🧹 Nitpick comments (1)
tests/d1-vote-sync.test.ts (1)
8-25: 🏗️ Heavy liftAdd one regression test for the actual sync/verify invariants.
These tests are good for the helper layer, but they stop short of the behavior this PR is fixing. A script-level test that seeds extra keys and asserts prune/verify handle them would catch regressions like orphaned
votes_by_clientrows or duplicate-row false negatives.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/d1-vote-sync.test.ts` around lines 8 - 25, Add a regression test to tests/d1-vote-sync.test.ts that goes beyond unit helpers (findOrphanVoteKeys, enumerateContentVoteKeys) by seeding extra vote keys (including an orphan like "mcp:old-slug" and duplicated rows) into the test DB/store, then invoking the script-level sync/verify entrypoint (the module that performs prune/verify; e.g., the project’s runVoteSync/pruneOrphanVotes/syncAndVerify function) and asserting post-run that orphaned keys were removed and duplicates flagged/handled; use findOrphanVoteKeys and enumerateContentVoteKeys to build expected sets and assertions so the test would catch regressions around orphaned votes_by_client rows and duplicate-row false negatives.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/sync-votes-to-d1.mjs`:
- Around line 89-108: The prune routine only deletes orphan rows from
votes_entries, leaving orphaned rows in votes_by_client; update the loop that
builds the orphan key set (using getVoteEntryKeys and orphans) so the
runWrangler DELETE covers both tables: execute DELETE FROM votes_entries WHERE
entry_key IN (...) and DELETE FROM votes_by_client WHERE entry_key IN (...), or
run a single transaction/command that deletes from both tables using the same
inList; ensure the same quoting logic (key.replaceAll("'", "''")) and the same
runWrangler invocation are reused so both tables are reconciled for runMode
remote/local.
---
Outside diff comments:
In `@scripts/verify-d1-votes.mjs`:
- Around line 21-33: The current getRows function only queries votes_entries, so
update it to fetch both tables and return both result sets for verification:
modify the args assembled in getRows (referencing getRows, d1Binding, runMode,
and args) to either run two SQL commands (e.g., two SELECTs: one for entry_key,
upvote_count FROM votes_entries and one for the relevant columns from
votes_by_client such as client_id, entry_key) or a single script that returns
both result sets, then adjust the caller to consume both results so the verifier
can compare orphan/missing keys across votes_entries and votes_by_client. Ensure
the remote/local flag logic using runMode and d1Binding remains unchanged.
- Around line 48-50: The verification currently only fails when totalRows <
totalExpected (allowing surplus duplicate rows to slip through); change the
check to fail on any row-count mismatch by replacing the strict "less than"
check with a strict inequality (totalRows !== totalExpected) using the raw
rows.length (variable totalRows) and update the associated error/exit path
accordingly; apply the same change to the similar block that appears at the
later lines (the other totalRows/totalExpected comparison) so any surplus or
shortage causes verification to fail even though the deduplicated Map actual may
hide duplicates.
---
Nitpick comments:
In `@tests/d1-vote-sync.test.ts`:
- Around line 8-25: Add a regression test to tests/d1-vote-sync.test.ts that
goes beyond unit helpers (findOrphanVoteKeys, enumerateContentVoteKeys) by
seeding extra vote keys (including an orphan like "mcp:old-slug" and duplicated
rows) into the test DB/store, then invoking the script-level sync/verify
entrypoint (the module that performs prune/verify; e.g., the project’s
runVoteSync/pruneOrphanVotes/syncAndVerify function) and asserting post-run that
orphaned keys were removed and duplicates flagged/handled; use
findOrphanVoteKeys and enumerateContentVoteKeys to build expected sets and
assertions so the test would catch regressions around orphaned votes_by_client
rows and duplicate-row false negatives.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: ad4d8943-139d-4a94-9812-f0285c563847
📒 Files selected for processing (4)
scripts/lib/enumerate-content-vote-keys.mjsscripts/sync-votes-to-d1.mjsscripts/verify-d1-votes.mjstests/d1-vote-sync.test.ts
📜 Review details
🔇 Additional comments (1)
scripts/lib/enumerate-content-vote-keys.mjs (1)
11-37: LGTM!Also applies to: 44-52
| const actualKeys = getVoteEntryKeys(runMode); | ||
| const orphans = actualKeys.filter((key) => !expected.has(key)); | ||
| if (orphans.length === 0) { | ||
| console.log(`${runMode}: no orphan vote rows to prune`); | ||
| return; | ||
| } | ||
|
|
||
| for (let index = 0; index < orphans.length; index += chunkSize) { | ||
| const chunk = orphans.slice(index, index + chunkSize); | ||
| const inList = chunk | ||
| .map((key) => `'${key.replaceAll("'", "''")}'`) | ||
| .join(", "); | ||
| runWrangler([ | ||
| "d1", | ||
| "execute", | ||
| d1Binding, | ||
| runMode === "remote" ? "--remote" : "--local", | ||
| "--command", | ||
| `DELETE FROM votes_entries WHERE entry_key IN (${inList});`, | ||
| ]); |
There was a problem hiding this comment.
Prune both vote tables during reconciliation.
Right now the prune path only deletes from votes_entries. Any orphaned votes_by_client.entry_key rows survive a slug rename/delete, so D1 is still inconsistent and the old vote data stays stranded even after sync. The fix here needs to reconcile both tables off the same orphan key set.
Possible direction
- const actualKeys = getVoteEntryKeys(runMode);
+ const actualKeys = getVoteEntryKeys(runMode);
const orphans = actualKeys.filter((key) => !expected.has(key));
if (orphans.length === 0) {
console.log(`${runMode}: no orphan vote rows to prune`);
return;
}
for (let index = 0; index < orphans.length; index += chunkSize) {
const chunk = orphans.slice(index, index + chunkSize);
const inList = chunk
.map((key) => `'${key.replaceAll("'", "''")}'`)
.join(", ");
runWrangler([
"d1",
"execute",
d1Binding,
runMode === "remote" ? "--remote" : "--local",
"--command",
- `DELETE FROM votes_entries WHERE entry_key IN (${inList});`,
+ [
+ `DELETE FROM votes_by_client WHERE entry_key IN (${inList});`,
+ `DELETE FROM votes_entries WHERE entry_key IN (${inList});`,
+ ].join(" "),
]);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const actualKeys = getVoteEntryKeys(runMode); | |
| const orphans = actualKeys.filter((key) => !expected.has(key)); | |
| if (orphans.length === 0) { | |
| console.log(`${runMode}: no orphan vote rows to prune`); | |
| return; | |
| } | |
| for (let index = 0; index < orphans.length; index += chunkSize) { | |
| const chunk = orphans.slice(index, index + chunkSize); | |
| const inList = chunk | |
| .map((key) => `'${key.replaceAll("'", "''")}'`) | |
| .join(", "); | |
| runWrangler([ | |
| "d1", | |
| "execute", | |
| d1Binding, | |
| runMode === "remote" ? "--remote" : "--local", | |
| "--command", | |
| `DELETE FROM votes_entries WHERE entry_key IN (${inList});`, | |
| ]); | |
| const actualKeys = getVoteEntryKeys(runMode); | |
| const orphans = actualKeys.filter((key) => !expected.has(key)); | |
| if (orphans.length === 0) { | |
| console.log(`${runMode}: no orphan vote rows to prune`); | |
| return; | |
| } | |
| for (let index = 0; index < orphans.length; index += chunkSize) { | |
| const chunk = orphans.slice(index, index + chunkSize); | |
| const inList = chunk | |
| .map((key) => `'${key.replaceAll("'", "''")}'`) | |
| .join(", "); | |
| runWrangler([ | |
| "d1", | |
| "execute", | |
| d1Binding, | |
| runMode === "remote" ? "--remote" : "--local", | |
| "--command", | |
| [ | |
| `DELETE FROM votes_by_client WHERE entry_key IN (${inList});`, | |
| `DELETE FROM votes_entries WHERE entry_key IN (${inList});`, | |
| ].join(" "), | |
| ]); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/sync-votes-to-d1.mjs` around lines 89 - 108, The prune routine only
deletes orphan rows from votes_entries, leaving orphaned rows in
votes_by_client; update the loop that builds the orphan key set (using
getVoteEntryKeys and orphans) so the runWrangler DELETE covers both tables:
execute DELETE FROM votes_entries WHERE entry_key IN (...) and DELETE FROM
votes_by_client WHERE entry_key IN (...), or run a single transaction/command
that deletes from both tables using the same inList; ensure the same quoting
logic (key.replaceAll("'", "''")) and the same runWrangler invocation are reused
so both tables are reconciled for runMode remote/local.
JSONbored
left a comment
There was a problem hiding this comment.
@jony376 this fixes only part of the vote-orphan problem.
- Prune orphan rows from both
votes_entriesandvotes_by_client. - Do not rely only on
ON DELETE CASCADE; make the script explicitly reconcilevotes_by_client.entry_keytoo. - Update
verify-d1-votes.mjsso surplus keys invotes_by_clientfail verification as well as surplus keys invotes_entries. - Add tests covering orphan client-vote rows after a slug rename or deletion.
- Update the branch against current
mainand rerun the D1 vote sync/verify tests. - Expected validation: focused D1 vote tests,
pnpm test, andgit diff --check.
Summary
Prune orphan D1 vote rows during sync and fail verification when surplus keys remain, using a shared enumerator for content vote keys.
Related Issues
Closes #564
Type of Change
Testing
Checklist