feat(memory): reduce corpus write serialization with exact apply by huangruiteng · Pull Request #2210 · volcengine/OpenViking

huangruiteng · 2026-05-24T05:44:21Z

Summary

This PR adds server-side pieces for faster agent-memory corpus preparation while preserving the normal per-trajectory experience semantics and Memory V2 graph consistency.

The practical target is Vaka / TAU-style agent-memory iteration: clients can submit many session commits concurrently, while the server owns file-level safety, patch shape, graph cleanup, retry telemetry, and bounded apply backpressure. This PR no longer includes batch experience consolidation or a batch provider; the acceleration path is now concurrent session commits + concurrent same-session per-trajectory experience phases + operation-exact apply window.

Product behavior changes only for operation-exact phases:

- memory.operation_exact_apply_window_seconds now defaults to 10.0s.
- Agent experience / trajectory and standard long-term extraction can opt into operation_exact apply locks.
- memory.agent_experience_per_trajectory_max_concurrency lets same-session per-trajectory experience phases run concurrently when operation-exact apply is enabled.
- memory.long_term_extraction_enabled=false can skip standard long-term user/tool/skill memories while preserving agent memories for agent-memory-only corpus builds.
- TAU corpus-build config records and validates corpus_session_commit_concurrency, expected lock modes, long-term extraction mode, and apply-window settings.

Design Boundary

This PR is deliberately not a semantic reconcile system.

- Operation-exact phases lock only the concrete files / overviews that extracted operations will apply to.
- The apply window briefly queues requests that target overlapping concrete file sets; one owner applies queued operations in arrival order under the union of exact locks.
- The window is a scheduling / ordering primitive, not a late reconcile prompt.
- Complete string outputs are normalized into structured SEARCH/REPLACE patches before apply whenever possible, so the apply layer can replay a delta on latest file content instead of replacing a whole stale snapshot.
- Experience supersedes / delete is treated as a graph rewrite, not ordinary metadata:
  - peer links / backlinks that point at superseded experiences are migrated to the replacement URI;
  - stale old-URI edges are cleaned before delete;
  - deleted-link endpoints are included in operation-exact lock/write sets;
  - superseded target reads are version-tracked so late endpoint drift triggers stale-read retry;
  - supersedes is only consumed after the target is resolved.
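
The apply-window ordering described above can be sketched as follows. This is only an illustration under stated assumptions, not the PR's implementation: `ApplyRequest`, `Batch`, and `group_overlapping` are hypothetical names, and the real window also involves timing and lock acquisition that are omitted here. The sketch shows how requests with overlapping concrete file sets could be merged into one batch that keeps arrival order and holds the union of their exact locks.

```python
from dataclasses import dataclass, field


@dataclass
class ApplyRequest:
    """Hypothetical request: arrival sequence number plus the exact files it writes."""
    seq: int
    files: frozenset


@dataclass
class Batch:
    """Requests that one owner applies together under a single union lock set."""
    lock_set: set = field(default_factory=set)
    requests: list = field(default_factory=list)


def group_overlapping(requests):
    """Merge requests whose file sets overlap, directly or transitively."""
    batches = []
    for req in sorted(requests, key=lambda r: r.seq):
        # Split existing batches into those this request touches and the rest.
        hits = [b for b in batches if b.lock_set & req.files]
        rest = [b for b in batches if not (b.lock_set & req.files)]
        merged = Batch(set(req.files), [req])
        for b in hits:
            merged.lock_set |= b.lock_set
            merged.requests.extend(b.requests)
        # Queued operations are applied in arrival order inside the batch.
        merged.requests.sort(key=lambda r: r.seq)
        batches = rest + [merged]
    return batches
```

Requests that touch disjoint files end up in separate batches and never wait on each other, which is the property the exact lock mode relies on.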

The intended contract is: clients may be aggressive about concurrency; server-side Memory V2 should provide safe apply semantics and telemetry without asking every client to implement its own serial discipline.

What Changed

- Adds operation-exact apply modes for agent experiences, agent trajectories, and standard long-term memory extraction.
- Adds memory.operation_exact_apply_window_seconds, default 10.0.
- Adds memory.agent_experience_per_trajectory_max_concurrency, default 4.
- Converts complete string outputs into structured StrPatch(blocks=[...]) for both merge_op=patch and merge_op=replace string fields when old field content is available.
- Lets MemoryUpdater apply a structured string patch through PatchOp even if the schema field is normally merge_op=replace.
- Keeps replace/delete/unknown/plain unstructured operations conflict-sensitive, while allowing structured string patches and safe merge ops to apply against latest content.
- Migrates replacement peer links / backlinks, cleans remaining old-URI edges for delete operations, includes deleted-link endpoints in operation-exact lock/write sets, and tracks superseded target reads for exact-apply stale retry.
- Resolves experience supersedes as a graph rewrite:
  - resolved targets are queued for delete through the normal updater path;
  - source trajectory links are inherited onto the superseding experience;
  - other inherited graph links/backlinks are rewritten from the superseded URI to the replacement URI, with same-replacement self-links dropped;
  - prefetched targets that disappear trigger operation-exact retry;
  - never-resolvable targets mark the operation invalid instead of silently creating a near-duplicate;
  - comma / semicolon / newline separated multi-target supersedes strings are supported, while an exact raw name is tried first so valid filenames containing separators still work.
- Adds memory.long_term_extraction_enabled, default true.
- Extends TAU benchmark config, preflight, generated commands, and corpus manifests to record and verify expected memory config.
- Adds phase telemetry for conflict-sensitive buckets/reasons, conflicts, retries, structured string conversion, apply-window leader/follower/wait signals, and per-trajectory experience concurrency.
- Adds read-only GET /api/v1/stats/memory-graph to let clients inspect memory graph health after concurrent corpus writes, including memory type counts, source links, backlinks, broken endpoints, missing reciprocal links, and violation samples.
- Adds async/sync local and HTTP client helpers for the same memory graph health summary, so corpus runners can gate on graph integrity without issuing raw stats HTTP calls.
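
The link-migration part of the supersedes graph rewrite can be illustrated with a small in-memory sketch. It assumes a hypothetical forward-link map (`dict` of source URI to a set of target URIs) and a hypothetical `migrate_links` helper; the actual MemoryUpdater works on files under locks and is not reproduced here. The sketch only shows the rewrite rule: every edge touching the superseded URI is redirected to the replacement URI, and edges that would become replacement-to-replacement self-links are dropped.

```python
def migrate_links(links, old_uri, new_uri):
    """Return a new forward-link map with old_uri rewritten to new_uri.

    `links` maps a source URI to the set of URIs it points at (hypothetical shape).
    """
    migrated = {}
    for src, targets in links.items():
        new_src = new_uri if src == old_uri else src
        bucket = migrated.setdefault(new_src, set())
        for dst in targets:
            new_dst = new_uri if dst == old_uri else dst
            if new_dst != new_src:  # drop same-replacement self-links
                bucket.add(new_dst)
    return migrated


if __name__ == "__main__":
    graph = {
        "exp/old": {"traj/1", "exp/new"},
        "exp/new": {"exp/old", "traj/2"},
        "exp/peer": {"exp/old"},
    }
    print(migrate_links(graph, "exp/old", "exp/new"))
```

In this example the source trajectory link `traj/1` is inherited by `exp/new`, the peer edge from `exp/peer` is redirected, and no edge mentions `exp/old` afterwards.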

Validation Signal

Small Throughput Probe

Small TAU-2 retail corpus-prepare probe, cached train transcripts, 8 successful sessions, wait timeout 3600s.

Phase times below are server telemetry attribution. They should not be added to task wall time directly, and the tree-control run did not record a complete `other` phase. The tree row therefore reports the lock bucket that accounts for the missing `other` bottleneck.

| mode | tasks | total task duration | experience phase | trajectory phase | long-term / `other` phase | read |
| --- | --- | --- | --- | --- | --- | --- |
| tree control | 8/8 | sum 3750.6s, max 748.3s | 62.7s, 5 calls, 0 retry | 942.5s, incl. 691.2s tree wait | phase not recorded; tree wait mostly tools+skills=2982.7s | baseline directory-lock control |
| all exact before merge-safe stale | 8/8 | sum 3221.1s, max 884.5s | 110.8s, 7 calls, 0 retry / 0 conflict | 262.4s, 8 calls, 0 retry / 0 conflict | 3199.1s, 26 calls, 18 retries / 44 conflicts | stale retries moved cost from lock wait to LLM rerun |
| all exact + merge-safe stale | 8/8 | sum 2608.4s, max 637.1s | 110.0s, 7 calls, 0 retry / 0 conflict | 200.7s, 8 calls, 0 retry / 0 conflict | 2582.1s, 22 calls, 14 retries / 41 conflicts | remaining conflicts all tools + plain_string_patch |

Read: this is a corpus-prepare throughput signal, not a benchmark-score claim. Experience and trajectory phases were already clean in this probe; the long tail was standard long-term memory extraction, especially tools updates that were still emitted as plain-string patches. The conversion layer addresses that patch-shape problem directly; the latest patch also removes batch consolidation and instead allows same-session per-trajectory experience phases to run concurrently.
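
To make the patch-shape point concrete, here is a minimal sketch of converting a full rewritten string into SEARCH/REPLACE style blocks and replaying them on newer content. It is not the PR's `StrPatch` / `PatchOp` code; `to_blocks`, `apply_blocks`, and `PatchConflict` are hypothetical names, the diff is line-based via the standard `difflib` module, and one line of context is assumed to be enough to anchor each block.

```python
import difflib


class PatchConflict(Exception):
    """Raised when a block cannot be located unambiguously in the latest content."""


def to_blocks(old, new):
    """Derive (search, replace) blocks from a full rewrite of `old` into `new`."""
    a, b = old.splitlines(keepends=True), new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    blocks = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Anchor each block with one unchanged neighbouring line so that
        # pure insertions still have something to search for.
        pre = a[i1 - 1:i1] if i1 > 0 else []
        post = a[i2:i2 + 1] if i1 == 0 else []
        search = "".join(pre + a[i1:i2] + post)
        replace = "".join(pre + b[j1:j2] + post)
        blocks.append((search, replace))
    return blocks


def apply_blocks(latest, blocks):
    """Replay blocks on the latest content; refuse ambiguous or missing matches."""
    for search, replace in blocks:
        if latest.count(search) != 1:
            raise PatchConflict(f"search text not found exactly once: {search!r}")
        latest = latest.replace(search, replace, 1)
    return latest
```

With this shape, a writer that read an older snapshot can still land its delta when another writer has appended unrelated lines in the meantime, while an edit to the same lines surfaces as a conflict instead of silently overwriting.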

Full Retail Corpus Graph Check

Cached TAU-2 retail train transcripts, success-only agent-memory corpus build, corpus_session_commit_concurrency=4, operation_exact_apply_window_seconds=10.0, long-term extraction disabled.

| run | committed / skipped | experiences | trajectories | links / backlinks | broken endpoints | missing backlinks | lingering `supersedes` | exp without source link | read |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pre-heal full run | 59 / 15 | 118 | 143 | 370 / 367 | 0 | 3 | 0 | 0 | graph endpoints existed, but three source-link backlinks were missing |
| patched full run | 59 / 15 | 108 | 135 | 278 / 278 | 0 | 0 | 0 | 0 | final links/backlinks are balanced; no duplicate experience stems |

The full patched run validates the main invariant: source lineage and replacement cleanup are handled by the server apply path, not by best-effort post-apply metadata edits.
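
The columns in the table above correspond to simple graph checks. The sketch below shows one way such a check could be computed over in-memory maps; `graph_health` and its input and output shapes are assumptions made for illustration and do not describe the response schema of GET /api/v1/stats/memory-graph.

```python
def graph_health(nodes, links, backlinks):
    """Summarise link integrity for a small memory graph.

    nodes:     set of existing URIs
    links:     source URI -> set of target URIs (forward links)
    backlinks: target URI -> set of source URIs (reciprocal links)
    """
    edges = [(src, dst) for src, targets in links.items() for dst in targets]
    # A forward link whose target file no longer exists.
    broken = sorted(e for e in edges if e[1] not in nodes)
    # A forward link whose target exists but does not point back.
    missing = sorted(
        (src, dst)
        for src, dst in edges
        if dst in nodes and src not in backlinks.get(dst, set())
    )
    return {
        "links": len(edges),
        "backlinks": sum(len(sources) for sources in backlinks.values()),
        "broken_endpoints": broken,
        "missing_backlinks": missing,
    }
```

A corpus runner could gate on `broken_endpoints` and `missing_backlinks` both being empty, which matches the balanced 278 / 278 result reported for the patched run.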

That full run also surfaced two multi-target supersedes strings from the extractor, for example one field naming several older experience candidates. The latest commit adds server-side parsing for that edge: try the exact raw target first, then split comma/semicolon/newline candidates; resolve every valid target; inherit all source trajectory links; and only treat the operation as invalid when no target can be resolved. This multi-target parser and replacement-link migration are covered by targeted tests; I did not rerun the full 59-session corpus after these small graph-rewrite follow-ups. The latest follow-up also records the superseded file base digest during graph rewrite, so if the old card gains new links before lock/apply, exact apply retries with a refreshed cleanup plan rather than mutating an endpoint that was not part of the original lock set.
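
The resolution order for multi-target supersedes strings can be sketched like this. The function name `resolve_supersedes_targets` and the use of a plain set of known experience names are assumptions for illustration; the server resolves against real memory files and also handles link inheritance, which is left out.

```python
import re


def resolve_supersedes_targets(raw, existing_names):
    """Resolve a free-text supersedes field to known experience names.

    Tries the exact raw string first so that valid names containing
    separators still resolve, then falls back to splitting on
    comma, semicolon, or newline.
    """
    raw = raw.strip()
    if raw in existing_names:
        return [raw]
    resolved = []
    for part in re.split(r"[,;\n]", raw):
        name = part.strip()
        if name and name in existing_names and name not in resolved:
            resolved.append(name)
    # An empty result means no target could be resolved; the caller
    # would then treat the operation as invalid.
    return resolved
```

Partial resolution is kept on purpose: if only some candidates exist, those are still returned, and only a completely unresolvable field yields an empty list.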

TAU Runner Contract

The TAU runner now supports and records the intended client-side write mode:

memory:
  agent_memory_enabled: true
  agent_experience_apply_lock_mode: operation_exact
  agent_trajectory_apply_lock_mode: operation_exact
  long_term_apply_lock_mode: operation_exact
  operation_exact_apply_window_seconds: 10.0
  long_term_extraction_enabled: false
openviking:
  corpus_session_commit_concurrency: 4

--strict-preflight checks the running OpenViking config before a matrix run. The cached corpus manifest records corpus_session_commit_concurrency, corpus_prepare_mode, stable input-order rows, and the expected OpenViking memory config so mismatched reruns fail fast instead of silently reusing an incompatible corpus.
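
A fail-fast comparison of the expected and running config could look like the sketch below. `flatten` and `config_mismatches` are hypothetical helpers written for this example; how the runner actually reads the server config and the manifest is not shown in this PR description.

```python
def flatten(cfg, prefix=""):
    """Flatten a nested dict into dotted keys, e.g. memory.long_term_extraction_enabled."""
    flat = {}
    for key, value in cfg.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


def config_mismatches(expected, running):
    """Return {dotted_key: (expected, actual)} for every missing or differing key."""
    want, got = flatten(expected), flatten(running)
    return {
        key: (value, got.get(key))
        for key, value in want.items()
        if key not in got or got[key] != value
    }


if __name__ == "__main__":
    expected = {
        "memory": {
            "long_term_apply_lock_mode": "operation_exact",
            "operation_exact_apply_window_seconds": 10.0,
            "long_term_extraction_enabled": False,
        },
        "openviking": {"corpus_session_commit_concurrency": 4},
    }
    running = {"memory": {"long_term_extraction_enabled": True}}
    problems = config_mismatches(expected, running)
    if problems:
        raise SystemExit(f"preflight failed: {problems}")
```

Only keys named in the expected config are checked, so extra server settings do not cause a failure.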

Follow-up / Open Questions

Validate the apply-window owner path on a larger full-corpus prepare with the batch provider removed and same-session per-trajectory concurrency enabled.
Decide whether tools / skills should emit structured patches directly instead of relying on the compatibility conversion bridge.
Replace free-text supersedes names with stable candidate URI/id when the extractor can expose replace candidates safely.
Keep staleness as telemetry / evaluation-policy language for now: record source policy, patch base, apply version, window wait/order, and retry/conflict reason before turning it into a product default budget.

Tests

Latest validation:

source /Users/bytedance/Documents/agent-harness/scripts/load_local_env.sh && uv run --with ruff ruff check openviking/session/compressor_v2.py tests/session/memory/test_compressor_v2.py openviking/session/memory/memory_updater.py tests/session/memory/test_memory_updater.py
source /Users/bytedance/Documents/agent-harness/scripts/load_local_env.sh && uv run --with ruff ruff format --check openviking/session/compressor_v2.py tests/session/memory/test_compressor_v2.py openviking/session/memory/memory_updater.py tests/session/memory/test_memory_updater.py
OPENVIKING_CONFIG_FILE=tests/api_test/ov.conf.template .venv/bin/python -m pytest tests/session/memory/test_compressor_v2.py::test_source_trajectory_links_attach_before_exact_lock tests/session/memory/test_compressor_v2.py::test_resolve_supersedes_tracks_deleted_file_version_for_exact_retry tests/session/memory/test_compressor_v2.py::test_resolve_supersedes_consumes_field_only_after_resolved tests/session/memory/test_compressor_v2.py::test_resolve_supersedes_migrates_peer_links_to_replacement_uri tests/session/memory/test_compressor_v2.py::test_resolve_supersedes_accepts_multiple_replaced_experiences tests/session/memory/test_compressor_v2.py::test_resolve_supersedes_keeps_partial_multi_target_resolution tests/session/memory/test_compressor_v2.py::test_resolve_supersedes_prefers_exact_name_before_splitting tests/session/memory/test_compressor_v2.py::test_resolve_supersedes_retries_when_prefetched_target_disappears tests/session/memory/test_compressor_v2.py::test_resolve_supersedes_unresolved_target_marks_operations_invalid tests/session/memory/test_memory_updater.py::TestMemoryUpdater::test_apply_operations_cleans_peer_backlinks_before_delete tests/session/memory/test_memory_updater.py::TestMemoryUpdater::test_apply_operations_migrates_replacement_links_and_cleans_old_uri tests/session/memory/test_memory_updater.py::TestMemoryUpdater::test_apply_operations_heals_preserved_forward_links_on_upsert tests/session/memory/test_memory_updater.py::TestMemoryUpdater::test_apply_operations_cleans_links_added_after_delete_snapshot (14 passed)
.venv/bin/python -m compileall -q openviking/session/compressor_v2.py tests/session/memory/test_compressor_v2.py openviking/session/memory/memory_updater.py tests/session/memory/test_memory_updater.py
OPENVIKING_CONFIG_FILE=$(mktemp-empty-config) .venv/bin/python -m pytest tests/server/test_api_stats_memory_graph.py -q (2 passed)
OPENVIKING_CONFIG_FILE=tests/api_test/ov.conf.template .venv/bin/python -m pytest tests/server/test_api_stats_memory_graph.py tests/client/test_rebuild_clients.py -q (12 passed)

github-actions · 2026-05-24T07:07:21Z

Persistent review updated to latest commit 2f09aa9

github-actions · 2026-05-24T07:10:05Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: Possible issue
Impact: Low

Suggestion: Add max retry limit for version conflicts

Add a maximum retry limit for operation-exact version conflicts to prevent potential infinite loops. Use a config setting similar to `v2_lock_max_retries`.

openviking/session/compressor_v2.py [1058-1075]

     if not _operation_exact_retry_driver:
         next_attempt = operation_exact_version_attempt
    +    max_retries = getattr(config.memory, "v2_version_conflict_max_retries", 10)
         while True:
    +        if next_attempt > max_retries:
    +            raise RuntimeError(f"[{phase_label}] Exceeded maximum version conflict retries ({max_retries})")
             result = await self._run_extract_phase(
                 provider=provider,
                 messages=messages,
                 ctx=ctx,
                 strict_extract_errors=strict_extract_errors,
                 phase_label=phase_label,
                 post_apply=post_apply,
                 force_tree_lock=force_tree_lock,
                 operation_exact_version_attempt=next_attempt,
                 _operation_exact_retry_driver=True,
             )
             if isinstance(result, _OperationExactRetrySignal):
                 next_attempt = result.next_attempt
                 continue
             return result

Suggestion importance[1-10]: 5

Why: The suggestion correctly identifies a potential infinite loop risk from unbounded version conflict retries, which is a valid concern. However, the provided improved code would cause a NameError because `config` is not defined before it's used in the outer retry loop (config is fetched later in the method). The core idea is sound, but the implementation needs adjustment to fetch config earlier.
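
One way to address the ordering concern raised in the review is to resolve the retry limit before entering the loop and pass it in explicitly. The sketch below is a standalone illustration with hypothetical names (`RetrySignal`, `run_with_retry_cap`, `run_phase`); it is not the code in compressor_v2.py, and the limit value is an assumption supplied by the caller.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class RetrySignal:
    """Hypothetical stand-in for a 'stale read, please retry' result."""
    next_attempt: int


async def run_with_retry_cap(run_phase, max_retries, first_attempt=0):
    """Call run_phase(attempt) until it returns a real result or the cap is hit."""
    attempt = first_attempt
    while True:
        # The limit is resolved by the caller before the loop starts,
        # so nothing here depends on state fetched later.
        if attempt > max_retries:
            raise RuntimeError(
                f"exceeded maximum version conflict retries ({max_retries})"
            )
        result = await run_phase(attempt)
        if isinstance(result, RetrySignal):
            attempt = result.next_attempt
            continue
        return result


async def _demo_phase(attempt):
    # Pretend the first two attempts hit a version conflict.
    return "applied" if attempt >= 2 else RetrySignal(attempt + 1)


if __name__ == "__main__":
    print(asyncio.run(run_with_retry_cap(_demo_phase, max_retries=5)))
```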

huangruiteng added 28 commits May 23, 2026 02:13

feat: batch agent experience consolidation

f690c88

style: format batch experience tests

f33643e

test: cover batch experience chunk sizing

309f916

fix: derive batch experience prompt from single provider

bf6acfb

style: format batch experience prompt adapter

098e50b

feat(memory): align experience prompt with atomic intent

f0de40e

fix(memory): match atomic experience prompt archive

f8dacfa

fix: satisfy batch experience lint

76ebbf6

fix: preserve batch experience granularity

d13ccbf

style: format batch experience test

dc03db1

feat: expose agent memory phase telemetry

941fa18

feat: surface commit telemetry in benchmark manifests

abe602b

fix: preserve batch action boundaries

8cce894

feat: audit experience corpus quality

3fe7bc5

chore: trim batch experience diagnostics

223a473

chore: tighten batch prompt adapter

cf6a7c7

chore: default TAU corpus prep to batch mode

beaf4b9

chore: format TAU batch eval config

1ae63ba

feat: prototype exact-lock experience apply

f581f90

feat: retry stale experience exact-lock applies

e81af24

fix: limit read version tracking to experiences

5ecc6ac

chore: expose experience exact-lock target diagnostics

bb72a75

chore: make TAU OpenViking wait timeout explicit

454d760

chore: raise default TAU OpenViking wait timeout

f385e1d

chore: expose memory phase lock plans

fe2c33e

chore: expose lock acquire bucket telemetry

165c22c

feat: add exact apply mode for agent trajectories

ee54577

feat: add operation-exact long-term apply diagnostics

533f03c

github-project-automation Bot added this to OpenViking project May 24, 2026

github-project-automation Bot moved this to Backlog in OpenViking project May 24, 2026

github-actions Bot added the Review effort 4/5 label May 24, 2026

huangruiteng added 3 commits May 24, 2026 18:48

feat(memory): report stale-read telemetry

d9ec156

feat(memory): allow agent-only memory extraction

9d14920

fix(tau2): validate cached memory config

afa092d

qin-ctx requested a review from chenjw May 25, 2026 03:26

feat(memory): convert plain patches and commit corpora concurrently

1c0a8b8

huangruiteng changed the title ~~feat(memory): add experimental exact-apply mode for long-term memory~~ feat(memory): reduce corpus write serialization with exact apply May 25, 2026

huangruiteng added 21 commits May 25, 2026 18:21

docs(tau2): default corpus prepare to efficient agent memory writes

4937d8b

docs(tau2): use exact apply as corpus prepare default

7676788

feat(memory): add operation exact apply window

2c84e89

chore(tau2): use engineering apply window

08652a4

feat(memory): enable default exact apply window

bb01758

style(memory): format exact apply changes

5c22e32

Merge latest main into memory versioned apply

27ebda9

chore(memory): set exact apply window default to ten seconds

be07939

chore(memory): fix lint after merge

adf369c

feat(memory): use per-trajectory concurrency for exact apply

096ff99

fix(memory): clean graph links on superseded deletes

95c8a60

fix(memory): retry stale supersedes replacements

f210f8e

fix(memory): include source links in exact apply

a4a5e78

fix(memory): heal preserved graph links on upsert

0d2fe59

fix(memory): resolve multi-target supersedes replacements

9c39d8d

fix(memory): migrate replacement graph links

66b8d18

style(memory): format supersedes tests

9cdbb1c

fix(memory): track supersedes reads for exact cleanup

972ddce

feat(memory): expose memory graph health stats

db808bf

feat(client): expose memory graph health helper

47f0aee

test(memory): align graph rewrite expectations

7974e06

huangruiteng wants to merge 54 commits into volcengine:main from huangruiteng:feat/memory-versioned-apply