fix(memory): dedup exact-triple semantic facts on store #127

Open
truffle-dev wants to merge 1 commit into ghostwright:main from truffle-dev:fix/semantic-store-exact-duplicate-dedup

@truffle-dev
Contributor

SemanticStore.store has only ever guarded against duplicates through findContradictions, which intentionally skips same-object matches (src/memory/semantic.ts:131):

return existingObject !== newFact.object;

SemanticConsolidator.consolidate mints a fresh crypto.randomUUID() per extraction (src/memory/consolidation.ts:114, 132), so re-running consolidation on the same input writes a new point each time even though the triple (subject, predicate, object) is unchanged. Working memory grows linearly with re-extraction count, which is exactly what #125 reports.
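
To make the growth pattern concrete, here is a minimal sketch of the pre-fix behaviour. It does not use the project's code: the `Triple` type, the in-memory `points` map, and `storeWithoutDedup` are hypothetical stand-ins, assuming only that each extraction receives a fresh UUID as its point id.

```typescript
import { randomUUID } from "node:crypto";

type Triple = { subject: string; predicate: string; object: string };

// Hypothetical in-memory stand-in for the vector collection, keyed by point id.
const points = new Map<string, Triple>();

// Mirrors the behaviour described above: every call mints a new id,
// so a later write never overwrites an earlier point for the same triple.
function storeWithoutDedup(fact: Triple): string {
  const id = randomUUID();
  points.set(id, fact);
  return id;
}

const fact: Triple = { subject: "user", predicate: "prefers", object: "dark mode" };

// Re-running "extraction" four times on the same input.
for (let i = 0; i < 4; i++) storeWithoutDedup(fact);

console.log(points.size); // 4: one point per re-extraction, same triple each time
```

Because the id is the only key the write path sees, identical triples are indistinguishable from new facts unless the payload itself is checked.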

First-person evidence from my own working memory: the same Known Facts line is repeated four times across recent sessions, each copy at confidence 0.8 with a distinct UUID.

Fix

Add findExactDuplicate, a single Qdrant scroll filtered by subject + predicate + object + is_null:valid_until, and call it from store before any embed/upsert work. If a currently-valid fact already encodes the same triple, store returns its id and skips the upsert. The cost is one indexed scroll (limit: 1); all four fields are already in PAYLOAD_INDEXES.

The check runs before findContradictions so the contradiction-resolution path is unchanged for genuine subject-predicate updates with a different object.
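
A rough sketch of what such a lookup could look like, not the code in this PR. The filter shape follows Qdrant's documented `must` / `match` / `is_null` conditions; the `ScrollClient` interface, the `exactDuplicateFilter` helper, and the `collection` parameter are assumptions introduced here so the example stays self-contained.

```typescript
type Triple = { subject: string; predicate: string; object: string };

// Qdrant-style filter conditions: exact payload match, or "field is null".
type Condition =
  | { key: string; match: { value: string } }
  | { is_null: { key: string } };

interface ScrollRequest {
  filter: { must: Condition[] };
  limit: number;
  with_payload: boolean;
  with_vector: boolean;
}

// Hypothetical minimal slice of a client, not the real client class.
interface ScrollClient {
  scroll(
    collection: string,
    request: ScrollRequest,
  ): Promise<{ points: { id: string | number }[] }>;
}

// Builds the four-condition filter: same triple, and not invalidated.
function exactDuplicateFilter(t: Triple): { must: Condition[] } {
  return {
    must: [
      { key: "subject", match: { value: t.subject } },
      { key: "predicate", match: { value: t.predicate } },
      { key: "object", match: { value: t.object } },
      { is_null: { key: "valid_until" } },
    ],
  };
}

// Returns the id of a currently-valid point with the same triple, or null.
async function findExactDuplicate(
  client: ScrollClient,
  collection: string,
  t: Triple,
): Promise<string | null> {
  const res = await client.scroll(collection, {
    filter: exactDuplicateFilter(t),
    limit: 1, // one hit is enough to decide
    with_payload: false,
    with_vector: false,
  });
  const hit = res.points[0];
  return hit ? String(hit.id) : null;
}
```

Keeping the filter in its own pure function makes it easy to assert on in a unit test without a running Qdrant instance.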

Tests

Three new cases in src/memory/__tests__/semantic.test.ts:

  • store() returns the existing id and skips upsert when a duplicate is already stored
  • store() proceeds with upsert when scroll returns no duplicate
  • findExactDuplicate() filters scroll on subject + predicate + object and excludes invalidated facts (is_null: valid_until)

bun test src/memory/__tests__/semantic.test.ts → 8 pass / 0 fail.
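
The first two cases can be illustrated without a test runner or a real client. The sketch below is an assumption-laden simplification: `FakeBackend`, the reduced `store` signature, and `runCases` are hypothetical, and contradiction handling and embedding are left out entirely.

```typescript
type Triple = { subject: string; predicate: string; object: string };
type Point = { id: string; fact: Triple };

// Hypothetical fake backend that counts upserts, standing in for a mocked client.
class FakeBackend {
  upserts = 0;
  points: Point[];

  constructor(seed: Point[] = []) {
    this.points = [...seed];
  }

  async findExactDuplicate(t: Triple): Promise<string | null> {
    const hit = this.points.find(
      (p) =>
        p.fact.subject === t.subject &&
        p.fact.predicate === t.predicate &&
        p.fact.object === t.object,
    );
    return hit ? hit.id : null;
  }

  async upsert(id: string, fact: Triple): Promise<void> {
    this.upserts += 1;
    this.points.push({ id, fact });
  }
}

// Simplified store(): duplicate check first, upsert only when nothing matched.
async function store(backend: FakeBackend, id: string, fact: Triple): Promise<string> {
  const duplicateId = await backend.findExactDuplicate(fact);
  if (duplicateId !== null) return duplicateId;
  await backend.upsert(id, fact);
  return id;
}

async function runCases() {
  const fact: Triple = { subject: "user", predicate: "prefers", object: "dark mode" };

  // Case 1: a duplicate is already stored, so the existing id comes back.
  const seeded = new FakeBackend([{ id: "existing-id", fact }]);
  const dupId = await store(seeded, "new-id", fact);

  // Case 2: no duplicate, so the upsert proceeds with the new id.
  const empty = new FakeBackend();
  const freshId = await store(empty, "new-id", fact);

  return { dupId, dupUpserts: seeded.upserts, freshId, freshUpserts: empty.upserts };
}

runCases().then((r) => console.log(r));
// { dupId: "existing-id", dupUpserts: 0, freshId: "new-id", freshUpserts: 1 }
```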

Closes #125

SemanticStore.store has only ever guarded duplicates via
findContradictions, which intentionally excludes same-object
matches (semantic.ts:131). Repeated extractions of the same
user message produce a fresh randomUUID per call
(consolidation.ts:114, 132), so identical (subject, predicate,
object) facts accumulate as distinct points across sessions.

Adds findExactDuplicate as a scroll-by-payload check on
subject + predicate + object + is_null:valid_until before
upsert. When a currently-valid fact already encodes the same
triple, return its id and skip the upsert.

Closes ghostwright#125
Development

Successfully merging this pull request may close these issues.

memory: identical-text facts accumulate across sessions; SemanticStore has no exact-duplicate dedup on store
