feat(memory): reconcile stale exact-lock updates by huangruiteng · Pull Request #2335 · volcengine/OpenViking

huangruiteng · 2026-05-30T22:45:07Z

Summary

This PR adds a narrow, opt-in write path for concurrent memory updates:

wrap runtime StrPatch values with the field value they were generated against (StrPatchWithBase)
wrap runtime string replacements with the field value they were generated against (ReplaceValueWithBase)
acquire exact file locks during memory apply, then read the latest target file before merging
rewrite stale SEARCH/REPLACE patches against the latest field content when configured
synthesize stale string replacements against the latest field content when configured
fail fast in exact-file-lock mode when the exact lock, latest read, merge/rewrite/synthesis, or endpoint link update cannot be applied safely

It also carries the minimal validation surface needed to run this end to end:

read-only memory graph health inspection under /api/v1/stats/memory-graph, exposed through local/http sync/async clients
graph-health signals for quality gates: link/backlink consistency, source-link coverage, schema-heading completeness, and lightweight near-duplicate/source-density diagnostics
TAU runner support for configurable concurrent corpus session commits plus cache identity checks
corpus gates for committed sessions, extracted memories, concrete scoped search/read, graph health, retrieval trace coverage, and completed scoreboards
corpus provenance in manifests: train results sha256, TAU-2 commit, OpenViking commit, and OpenViking config path/hash

Scope boundary

Included in this PR:

same-URI patch + StrPatch stale SEARCH/REPLACE rewrite primitive
same-URI string replace stale synthesis primitive; stale full-value proposals must be synthesized against latest content or fail fast
exact-file-lock operation-level fail-fast safety
exact-file-lock apply runtime-scoped to agent memory extraction for trajectories and experiences
legacy plain-string patch in the agent exact path interpreted as a base-aware full replacement, not as substring patch
agent-only validation config via memory.long_term_extraction_enabled=false, memory.agent_memory_enabled=true, and default/off memory.session_skill_extraction_enabled=false
deterministic transcript-local delta handling for tool/skill SUM counters
generic delete-time StoredLink cleanup for peer links/backlinks
per-phase memory_diff_<phase>.json archives with apply_trace fields (uri, field, merge_op, input/wrapper shape, stale/rewrite status, applied/failed status)
TAU corpus-build runner plumbing for commit concurrency, provenance manifesting, and manifest/cache validation

Not included here:

exact-file-lock apply for standard long-term user memory, tool/skill memory, or session_skills
non-string replace synthesis / typed schema merge
tools/skills exact-file-lock adapter
cross-URI create-new semantic consolidation/admission
session skill asset adapter/lifecycle validation
failure-experience retrieval or prompt changes
a TAU reward/effect uplift claim, a full-C10 recommended-default claim, or a linear wall-time improvement claim

The new memory-write behavior is off by default behind:

memory.memory_apply_exact_file_lock_enabled
memory.stale_patch_rewrite_enabled

TAU corpus commit concurrency also defaults to serial (1) unless explicitly configured.

Validation

Latest GitHub checks at head 9c22aebe:

check-deps: passed
lint / lint: passed
API & CLI Integration Tests (ubuntu-24.04): passed
pr_review: passed
build: skipped

Targeted local validation on this branch includes:

tests/session/memory/test_memory_timestamp_parsing.py tests/session/memory/test_memory_updater.py tests/session/memory/test_merge_ops.py
- Result after duplicate-delete cleanup: 109 passed
targeted stale rewrite / stale replacement / stale delete trace tests in tests/session/memory/test_memory_updater.py and tests/session/memory/test_merge_ops.py
session-skill / agent-only config gates, extract-loop match-text, tool/skill transcript-local deltas, schema exact-boundary tests, compressor exact-boundary tests, TAU runner/analyzer gates, and phase memory-diff trace coverage
uvx ruff and git diff --check on touched runtime/test files: passed in the final local validation sequence

Full-train corpus gate at head 9c22aebe:

Exact C4 corpus artifact: /private/tmp/pr2335_9ef7e46d_full_train_exact_c4_deleteskip_20260602_010530
Frozen tree C1 corpus artifact: /private/tmp/pr2335_fe114b20_full_train_tree_c1_20260601_231521
Four corpus manifests (retail/airline x exact/tree) passed strict_corpus_gate with claim_valid=true
All four corpora had healthy graph checks: source-linkless, broken endpoint, missing backlink, missing forward link, parse error, and missing required heading counts were all 0
Scoped search/read probes returned 10 hits with 2 non-empty concrete reads for each corpus

Full-train corpus shape:

Corpus	Sessions committed / skipped	Memories extracted	Experiences	Trajectories	Source avg	Single-source	Near-dup note
exact retail C4	63 / 11	691	78	180	5.90	29.49%	name-similar pairs 113
tree retail C1	63 / 11	670	62	174	7.11	33.87%	name-similar pairs 49
exact airline C4	26 / 4	136	25	45	3.44	36.00%	content-similar pairs 1
tree airline C1	26 / 4	133	32	44	2.75	50.00%	content-similar pairs 3

Weighted source-density read:

exact: 103 experiences / 546 source links / source avg 5.30 / single-source 31.1%
tree: 94 experiences / 529 source links / source avg 5.63 / single-source 39.4%

Read: source-density and near-dup are measured and did not hit graph/schema hard failures. They are still a quality boundary, especially for exact-retail semantic fragmentation / create-new consolidation. This PR should not be read as semantic applicability parity.

Full held-out r4 eval using the frozen full-train corpora:

Exact eval artifact: /private/tmp/pr2335_9c22aebe_full_eval_plan_20260602_032646/exact_result/exact_full_eval_r4_exec_20260602_033640
Tree eval artifact: /private/tmp/pr2335_9c22aebe_full_eval_plan_20260602_032646/tree_result/tree_full_eval_r4_exec_20260602_041900
Both runs: retail + airline, full held-out test split, 4 repeats, 8/8 cells completed, 0 failed, 240 simulations

Mode	Retail reward / DB	Airline reward / DB	Task-weighted reward	Task-weighted DB	Retrieval aggregate
exact	0.81875 / 0.81875	0.78750 / 0.81250	0.80833	0.81667	513 events / 1026 injected / 1026 non-empty reads
tree	0.84375 / 0.84375	0.75000 / 0.76250	0.81250	0.81667	482 events / 964 injected / 964 non-empty reads

Same-protocol exact vs tree:

task-weighted reward delta: -0.0042
task-weighted DB delta: 0.0
retail exact lower by 0.025 reward/DB
airline exact higher by 0.0375 reward and 0.05 DB
all exact/tree summaries had corpus_evidence.claim_valid=true and effect_evidence.claim_valid=true

Read: this supports no catastrophic scoreboard/corpus/retrieval-plumbing regression for the PR-1 exact path. It does not claim reward uplift or semantic relevance of every retrieved memory.

Apply trace summary on full-train corpus writes:

Mode	Apply trace rows	Applied	Skipped stale-deleted	Stale detected / rewrite attempted
exact C4	2062	2056	6	87 / 87
tree C1	1992	1992	0	0 / 0

Read: stale/delete behavior is recorded explicitly instead of being hidden as success. Rewrite latency/cost and full-C10 stress rewrite rate remain follow-up telemetry, not claims in this PR.

Claim boundary

This is a common server-side write-safety primitive, not a TAU reward claim. The validated runtime scope is agent trajectories / experiences under the agent-only extraction config. The full-train and full held-out r4 evidence supports corpus health, graph/schema health, retrieval plumbing, and non-catastrophic score regression versus the frozen tree-lock baseline. It does not claim exact-file-lock coverage for standard long-term user memory, tools/skills, or session_skills; it does not solve cross-URI create-new duplicates or experience granularity drift; and it does not claim full-C10 wall-time linear improvement.

github-actions · 2026-05-30T22:46:49Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🏅 Score: 90
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes Sub-PR theme: Extract: Skip schema tree lock when exact file lock mode is enabled Relevant files: openviking/session/compressor_v2.py tests/session/memory/test_compressor_v2.py Sub-PR theme: Memory apply: Exact file lock + stale patch rewrite Relevant files: openviking/session/memory/memory_updater.py openviking/session/memory/merge_op/base.py openviking/session/memory/merge_op/patch.py openviking_cli/utils/config/memory_config.py tests/session/memory/test_memory_updater.py tests/session/memory/test_merge_ops.py
⚡ Recommended focus areas for review Minor code duplication Duplicate `_text_digest` function exists in both `memory_updater.py` and `patch.py`. Consider extracting to a shared utility module for consistency. def _text_digest(value: Optional[str]) -> Optional[str]: if value is None: return None return hashlib.sha256(value.encode("utf-8")).hexdigest() LLM resilience considerations The `_rewrite_stale_patch` function calls `StructuredLLM().complete_model_async` without explicit timeout/retry configuration in the visible code. Ensure the underlying LLM client has proper resilience for production use (this may already be handled by `StructuredLLM` internally). async def _rewrite_stale_patch( *, current_value: str, patch_value: StrPatchWithBase, error: Exception, ) -> Optional[StrPatch]: """Ask the configured LLM to rewrite a stale string patch for latest content.""" from openviking_cli.utils.llm import StructuredLLM original_blocks = [ {"search": block.search, "replace": block.replace} for block in patch_value.blocks ] prompt = ( "Rewrite a stale SEARCH/REPLACE patch so it applies to the latest memory field.\n" "Return ONLY the JSON schema requested by the caller.\n\n" "Rules:\n" "- Preserve the user's intended edit from the original patch.\n" "- Use exact SEARCH text copied from the latest field value.\n" "- Do not include line-number prefixes unless they are truly part of the field.\n" "- If no safe rewrite exists, return an empty blocks list.\n\n" f"Patch base value:\n{patch_value.base_value or ''}\n\n" f"Latest field value:\n{current_value}\n\n" f"Original patch blocks:\n{original_blocks}\n\n" f"Patch error:\n{error}" ) rewritten = await StructuredLLM().complete_model_async(prompt, StrPatch) if rewritten and rewritten.blocks: return rewritten return None

github-actions · 2026-05-30T22:47:57Z

PR Code Suggestions ✨

No code suggestions found for the PR.

…-patch-rewrite-pr1

feat(memory): add exact-lock stale patch rewrite

bfde107

github-project-automation Bot added this to OpenViking project May 30, 2026

github-project-automation Bot moved this to Backlog in OpenViking project May 30, 2026

huangruiteng added 7 commits May 31, 2026 07:07

fix(memory): fail fast on exact-lock read and link errors

89518bb

style(memory): format exact-lock changes

7866b7f

feat(memory): add graph health validation hooks

6c61bcf

fix(memory): clean graph links on file delete

8153b78

feat(memory): gate exact apply by merge semantics

533cefc

style(memory): format exact-lock test

fa4f5e4

test(server): isolate embedded service config

8b3c04c

huangruiteng changed the title ~~feat(memory): add exact-lock stale patch rewrite~~ feat(memory): reconcile stale exact-lock updates May 31, 2026

huangruiteng added 16 commits June 1, 2026 00:10

feat(memory): synthesize stale string replacements

aac25bd

fix(memory): require read base for exact replacement

4c82f4a

fix(memory): harden exact agent apply boundary

7807e18

test(tau2): record corpus provenance in memory eval

7ce2c1f

test(memory): cover session skill extraction switch

26ca37b

feat(memory): write agent phase memory diffs

d37c0df

Merge remote-tracking branch 'origin/main' into feat/memory-file-lock…

7c57fee

…-patch-rewrite-pr1

style(session): sort imports after main merge

fb5e856

test(memory): cover stale rewrite apply traces

c66b5eb

test(memory): keep session skill fake phase signature current

eb35705

fix(memory): preserve subsecond trajectory timestamp keys

f1936b7

fix(memory): dedupe duplicate deletes in apply batch

fe114b2

fix(memory): skip stale exact upsert races

afea3b5

fix(memory): treat missing exact deletes as stale

9ef7e46

style(memory): format exact stale delete telemetry

9c22aeb

refactor(memory): share exact-lock text digest helper

42da595

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(memory): reconcile stale exact-lock updates#2335

feat(memory): reconcile stale exact-lock updates#2335
huangruiteng wants to merge 24 commits into
volcengine:mainfrom
huangruiteng:feat/memory-file-lock-patch-rewrite-pr1

huangruiteng commented May 30, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 30, 2026

Uh oh!

github-actions Bot commented May 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

huangruiteng commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Scope boundary

Validation

Claim boundary

Uh oh!

github-actions Bot commented May 30, 2026

PR Reviewer Guide 🔍

Uh oh!

github-actions Bot commented May 30, 2026

PR Code Suggestions ✨

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

huangruiteng commented May 30, 2026 •

edited

Loading