fix(memory): dedup exact-triple semantic facts on store #127
Open
truffle-dev wants to merge 1 commit into
SemanticStore.store has only ever guarded duplicates via findContradictions, which intentionally excludes same-object matches (semantic.ts:131). Repeated extractions of the same user message produce a fresh randomUUID per call (consolidation.ts:114, 132), so identical (subject, predicate, object) facts accumulate as distinct points across sessions.

Adds findExactDuplicate as a scroll-by-payload check on subject + predicate + object + is_null:valid_until before upsert. When a currently-valid fact already encodes the same triple, return its id and skip the upsert.

Closes ghostwright#125
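The payload filter described in the commit message could look roughly like the sketch below. It assumes Qdrant's documented filter JSON (a `must` list of `match.value` conditions plus an `is_null` condition); the `Triple` type and the `exactDuplicateFilter` helper are hypothetical names for illustration and are not taken from the PR.

```typescript
// Hypothetical shape for the identifying triple of a semantic fact.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Builds a Qdrant-style payload filter that matches a fact only when all
// three triple fields are equal and valid_until is null. Treating a null
// valid_until as "still valid" is an assumption drawn from the PR text.
function exactDuplicateFilter(t: Triple) {
  return {
    must: [
      { key: "subject", match: { value: t.subject } },
      { key: "predicate", match: { value: t.predicate } },
      { key: "object", match: { value: t.object } },
      { is_null: { key: "valid_until" } },
    ],
  };
}

const exampleFilter = exactDuplicateFilter({
  subject: "user",
  predicate: "prefers",
  object: "bun",
});
console.log(JSON.stringify(exampleFilter, null, 2));
```

A filter of this shape would be passed to a scroll request with a limit of 1, since a single hit is enough to decide that the upsert can be skipped.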
`SemanticStore.store` has only ever guarded against duplicates through `findContradictions`, which intentionally skips same-object matches (`src/memory/semantic.ts:131`).

`SemanticConsolidator.consolidate` mints a fresh `crypto.randomUUID()` per extraction (`src/memory/consolidation.ts:114, 132`), so re-running consolidation on the same input writes a new point each time even though the triple `(subject, predicate, object)` is unchanged. Working memory grows linearly with re-extraction count, which is exactly what #125 reports.

First-person evidence in my own working memory: the same `Known Facts` line repeated four times across recent sessions, each at confidence 0.8 with a distinct UUID.

Fix
Add `findExactDuplicate` as a single Qdrant `scroll` filtered by `subject` + `predicate` + `object` + `is_null:valid_until`, called from `store` before any embed/upsert work. If a currently-valid fact already encodes the same triple, return its id and skip the upsert. Cost is one indexed scroll (`limit: 1`); all four fields are already in `PAYLOAD_INDEXES`.

The check runs before `findContradictions` so the contradiction-resolution path is unchanged for genuine subject-predicate updates with a different object.

Tests
Three new cases in `src/memory/__tests__/semantic.test.ts`:

- `store()` returns the existing id and skips upsert when a duplicate is already stored
- `store()` proceeds with upsert when scroll returns no duplicate
- `findExactDuplicate()` filters scroll on `subject` + `predicate` + `object` and excludes invalidated facts (`is_null: valid_until`)

`bun test src/memory/__tests__/semantic.test.ts` → 8 pass / 0 fail.

Closes #125
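To make the control flow in the Fix section concrete, here is a minimal in-memory sketch of the dedup-before-upsert ordering. It is not the PR's implementation: `InMemorySemanticStore`, the `Fact` shape, and the counter-based ids are hypothetical stand-ins. The real code scrolls Qdrant and mints ids with `crypto.randomUUID()`, and the sketch is synchronous only to stay self-contained.

```typescript
interface Fact {
  id: string;
  subject: string;
  predicate: string;
  object: string;
  valid_until: string | null; // null is assumed to mean "currently valid"
}

type TripleInput = Omit<Fact, "id" | "valid_until">;

// Hypothetical in-memory stand-in for the vector store.
class InMemorySemanticStore {
  readonly points: Fact[] = [];
  upserts = 0;
  private nextId = 1;

  // Returns the id of a currently-valid fact with the same triple, or null.
  findExactDuplicate(f: TripleInput): string | null {
    const hit = this.points.find(
      (p) =>
        p.subject === f.subject &&
        p.predicate === f.predicate &&
        p.object === f.object &&
        p.valid_until === null,
    );
    return hit ? hit.id : null;
  }

  store(f: TripleInput): string {
    // The duplicate check runs first, before any embed/upsert work.
    const existing = this.findExactDuplicate(f);
    if (existing !== null) return existing;

    // Contradiction resolution for same subject + predicate with a
    // different object would run here; it is omitted from this sketch.
    const id = `fact-${this.nextId++}`;
    this.points.push({ id, ...f, valid_until: null });
    this.upserts += 1;
    return id;
  }
}

const demo = new InMemorySemanticStore();
const demoTriple = { subject: "user", predicate: "prefers", object: "bun" };
const firstId = demo.store(demoTriple);
const secondId = demo.store(demoTriple);
console.log(firstId === secondId, demo.upserts); // true 1
```

Storing the same triple twice yields one point and the same id, which mirrors the first test case listed above. An invalidated fact (non-null `valid_until`) is ignored by the check, so a later store of the same triple would write a new point.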