[WIP] Testing in progress - WM v2 Memory Preprocessing by jcp0578 · Pull Request #2131 · volcengine/OpenViking

jcp0578 · 2026-05-19T15:16:07Z

Description

Introduce WM v2 memory preprocessing so long session payloads can be compacted before the WM update step, then refine activation thresholds and rendering strategy to keep the behavior conservative when savings are marginal.

Related Issue

N/A

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Refactoring (no functional changes)
Performance improvement
Test update

Changes Made

Add ExtractionPreprocessor to the WM v2 update path so full session messages can be distilled into a compact packet before the LLM update step.
Extract structured signals and select spans with MMR-based deduplication to preserve high-value context while reducing token cost.
Add conservative fallback paths for short sessions, weak compaction, risky sessions, and failed tool-selection cases.
Add adaptive preprocessing parameters so span budget and fact caps scale with session size.
Introduce tiered compact rendering formats for small, medium, and large sessions to improve activation on mid-sized inputs.
Align CREATION and UPDATE span budgets so preprocessing can activate consistently during first-time WM generation.
Add a minimum absolute savings threshold so sessions with very small estimated gains fall back instead of compacting on noisy token estimates.
Keep the feature wired only into the WM v2 update path; apart from the span-budget alignment, the WM creation flow and long-term extraction behavior remain unchanged.
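
The MMR-based span selection listed above can be illustrated with a minimal sketch. It is not the PR's `ExtractionPreprocessor` code: the names `jaccard` and `select_spans_mmr`, the token-set similarity, the `lam` trade-off weight, and the word-count budget are all hypothetical choices made for illustration.

```python
from typing import List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-set overlap in [0, 1]; a stand-in for whatever similarity the PR uses."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_spans_mmr(
    spans: List[Tuple[str, float]],  # (text, relevance score) pairs
    budget_words: int,               # hypothetical budget, counted in words
    lam: float = 0.7,                # assumed relevance/redundancy trade-off
) -> List[str]:
    """Greedy Maximal Marginal Relevance selection under a word budget."""
    tokens = [set(text.lower().split()) for text, _ in spans]
    remaining = list(range(len(spans)))
    chosen: List[int] = []
    used = 0
    while remaining:
        def mmr(i: int) -> float:
            # Penalize a candidate by its closest match among spans already chosen.
            redundancy = max((jaccard(tokens[i], tokens[j]) for j in chosen), default=0.0)
            return lam * spans[i][1] - (1.0 - lam) * redundancy

        best = max(remaining, key=mmr)
        remaining.remove(best)
        cost = len(spans[best][0].split())
        if used + cost > budget_words:
            continue  # span does not fit; try the next candidate
        chosen.append(best)
        used += cost
    return [spans[i][0] for i in chosen]
```

In this sketch a near-duplicate of an already selected span loses score through the redundancy term, so a less relevant but novel span can be picked ahead of it.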

Testing

I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I have tested this on the following platforms:
- Linux
- macOS
- Windows

Tested locally / referenced in branch commits:

LoCoMo small test cases

| Mode | Run | Accuracy | Correct / Total | Gateway Ingest Tokens | Gateway QA Tokens | OV Ingest LLM Tokens | Combined Total Tokens | Tokens per Successful Task |
|---|---|---|---|---|---|---|---|---|
| OFF | 1 | 88.57% | 31 / 35 | 70,402 | 759,712 | 114,015 | 944,129 | 30,455.77 |
| OFF | 2 | 85.71% | 30 / 35 | 70,349 | 755,764 | 106,295 | 932,408 | 31,080.27 |
| OFF | 3 | 91.43% | 32 / 35 | 70,472 | 772,242 | 96,306 | 939,020 | 29,344.38 |
| ON | 1 | 85.71% | 30 / 35 | 70,762 | 661,930 | 88,601 | 821,293 | 27,376.43 |
| ON | 2 | 97.14% | 34 / 35 | 70,259 | 774,898 | 96,563 | 941,720 | 27,697.65 |
| ON | 3 | 97.14% | 34 / 35 | 70,840 | 796,350 | 97,856 | 965,046 | 28,383.71 |

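Averages by mode

| Mode | Avg. Accuracy | Avg. Correct Answers | Avg. Gateway Ingest | Avg. Gateway QA | Avg. OV Ingest LLM | Avg. Combined | Avg. Tokens per Successful Task |
|---|---|---|---|---|---|---|---|
| OFF | 88.57% | 31.00 | 70,407.67 | 762,572.67 | 105,538.67 | 938,519.00 | 30,293.47 |
| ON | 93.33% | 32.67 | 70,620.33 | 744,392.67 | 94,340.00 | 909,353.00 | 27,819.26 |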
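
The token averages in the summary can be re-derived from the per-run rows. The sketch below recomputes the mean Combined tokens and the mean tokens per successful task for each mode, plus the relative reduction in mean Combined tokens; the variable and function names are illustrative only.

```python
from statistics import mean

# (correct answers, combined total tokens) per run, copied from the per-run table
runs = {
    "OFF": [(31, 944_129), (30, 932_408), (32, 939_020)],
    "ON": [(30, 821_293), (34, 941_720), (34, 965_046)],
}


def summarize(rows):
    """Return (mean combined tokens, mean tokens per successful task)."""
    combined = mean(total for _, total in rows)
    per_success = mean(total / correct for correct, total in rows)
    return combined, per_success


off_combined, off_per_success = summarize(runs["OFF"])
on_combined, on_per_success = summarize(runs["ON"])

# Relative reduction in mean Combined tokens with preprocessing ON
saving = 1 - on_combined / off_combined

print(f"OFF: {off_combined:,.2f} combined, {off_per_success:,.2f} per success")
print(f"ON:  {on_combined:,.2f} combined, {on_per_success:,.2f} per success")
print(f"Mean combined-token reduction: {saving:.1%}")
```

With these figures the reduction in mean Combined tokens works out to roughly 3.1%. Three runs per mode is a small sample, so the difference should be read with that in mind.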

LoCoMo sample0 test cases: testing in progress

Checklist

My code follows the project's coding style
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Screenshots (if applicable)

N/A

Additional Notes

The preprocessing logic is intentionally conservative. If compaction is too small to be confidently beneficial, the flow falls back to the full-message path instead of risking information loss for marginal token savings.

Insert an ExtractionPreprocessor before the LLM in the WM v2 update path that compresses full session messages into a compact packet using rule-based signal extraction and MMR-based span deduplication.

Architecture:
raw messages -> rule extraction (structured_facts) + MMR span selection -> compact packet -> LLM WM update

Safety nets:
- Short sessions (<600 tokens) -> session_too_short fallback
- Compact not smaller enough -> compact_not_smaller_enough fallback
- High risk sessions -> auto expand budget 1.5x or fallback
- Failed tool messages not selected -> fallback
- Full archive always preserved via ov_archive_search

Signal extraction covers: errors, corrections, preferences, dates, goals, open issues, paths, URLs, functions, plugins, recall, fallback, components

Truncation uses paragraph-aware extraction instead of naive head-truncation, preserving the most information-dense paragraphs within the char budget.

Config (all default-off): wm_v2_preprocess_enabled, wm_v2_preprocess_max_span_tokens, wm_v2_preprocess_fallback_ratio

Only wired into WM v2 update path; creation and long-term extraction unchanged.

Tests: 26 preprocessor unit tests + 19 fixture scenarios; 133 total passing when the existing WM v2 guard/growth tests are included.

Co-Authored-By: deepseekV4-pro <noreply@deepseek.com>
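
The paragraph-aware truncation described in this commit message can be sketched roughly as follows. The PR does not say how information density is measured, so the `density` score here (share of tokens containing digits, slashes, or underscores, as a rough proxy for paths, identifiers, and numbers) is purely an assumption, as are the function names.

```python
import re
from typing import List


def density(paragraph: str) -> float:
    """Hypothetical density score: fraction of tokens with a digit, '/' or '_'."""
    tokens = paragraph.split()
    if not tokens:
        return 0.0
    dense = sum(1 for t in tokens if re.search(r"[\d/_]", t))
    return dense / len(tokens)


def truncate_by_paragraph(text: str, char_budget: int) -> str:
    """Keep the highest-scoring paragraphs that fit, emitted in original order."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    ranked = sorted(
        range(len(paragraphs)),
        key=lambda i: density(paragraphs[i]),
        reverse=True,
    )
    kept: List[int] = []
    used = 0
    for i in ranked:
        # Every paragraph after the first also pays for the "\n\n" joiner.
        cost = len(paragraphs[i]) + (2 if kept else 0)
        if used + cost <= char_budget:
            kept.append(i)
            used += cost
    return "\n\n".join(paragraphs[i] for i in sorted(kept))
```

Unlike head-truncation, which would keep whatever happens to come first, this keeps a late paragraph that mentions a file path or error code in preference to an early greeting.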

github-actions · 2026-05-19T15:17:37Z

PR Reviewer Guide 🔍

(Review updated until commit `f34913b`)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🏅 Score: 80
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: WM v2 Memory Preprocessing
Relevant files:
- openviking/session/extraction_preprocessor.py
- openviking/session/session.py
- openviking_cli/utils/config/memory_config.py
- tests/unit/session/test_extraction_preprocessor.py
- tests/unit/session/test_fixture_token_savings.py

Sub-PR theme: Auto-recall Improvements
Relevant files:
- examples/openclaw-plugin/auto-recall.ts
- examples/openclaw-plugin/context-engine.ts
- examples/openclaw-plugin/process-manager.ts

⚡ Recommended focus areas for review

No-op change
The to_dict method was changed to the same line, which is a no-op. This might be a mistake or a whitespace change.

```python
def to_dict(self) -> Dict[str, Any]:
    """Convert configuration to dictionary."""
    return self.model_dump()
```

Forced recall fallback
The code now forces a recall fallback even when ctx is available, which might change behavior for non-/v1/responses flows. This is noted as a diagnostic/workaround, but should be verified.

```typescript
// entered but the transform-context recall path is never reached.
const forcedRecallFallback = await tryMainPathRecallFallback("main_force");
if (forcedRecallFallback) {
  return forcedRecallFallback;
}
```

github-actions · 2026-05-19T15:20:05Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: General
Impact: Low

Suggestion: Replace bare except with specific error handling

The bare `except: pass` silently swallows all exceptions, making debugging difficult. Replace it with a specific exception handler that logs the error (even if continuing execution) to improve visibility into issues.

benchmark/locomo/data/build_codex_benchmark.py [25-26]

```diff
-except:
-    pass
+except json.JSONDecodeError as e:
+    print(f"Warning: Failed to decode JSON line: {e}", file=sys.stderr)
+    continue
+except Exception as e:
+    print(f"Warning: Unexpected error processing line: {e}", file=sys.stderr)
+    continue
```

Suggestion importance [1-10]: 5

Why: The suggestion addresses a poor practice (bare `except: pass`) by adding specific error handling with logging. This improves debuggability without changing the core functionality, making it a moderate-impact improvement.

Change creation_span_budget * 2 to creation_span_budget in session.py, matching the UPDATE path's span budget (1200). The 2x multiplier was causing compact packets to be larger than full messages on CREATION, preventing the preprocessor from ever triggering ACTIVE on first-time WM generation.

End-to-end verification with a 150-message real Codex session confirms:
- Preprocessor achieves 40% token savings (50K→30K) in ACTIVE mode
- Accuracy ON (45%) >= OFF (40%), no quality regression
- QA input tokens reduced 70%, total tokens reduced 74%

Co-Authored-By: deepseekV4-pro <noreply@deepseek.com>

- Add _resolve_adaptive_options() to scale span budget and facts cap with session size (P0 optimization)
- Implement three-tier rendering format:
  - Tier 1 (<2K tokens): ultra-compact # WM-Compact with inline facts
  - Tier 2 (2K-8K tokens): moderate format with reduced headers
  - Tier 3 (>8K tokens): full format (existing behavior)
- Middle sessions (~5K tokens) now trigger ACTIVE instead of FALLBACK

Co-Authored-By: deepseekV4-pro <noreply@deepseek.com>
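
As a rough illustration of the three-tier rendering choice, the sketch below maps an estimated session size to a tier. The thresholds come from the commit message; the `RenderTier` enum, the `pick_render_tier` name, and the handling of the exact boundary values (2,000 and 8,000 tokens are treated as Tier 2) are assumptions, and the internals of `_resolve_adaptive_options()` are not reproduced here.

```python
from enum import Enum


class RenderTier(Enum):
    ULTRA_COMPACT = 1  # Tier 1: "# WM-Compact" with inline facts
    MODERATE = 2       # Tier 2: reduced headers
    FULL = 3           # Tier 3: existing full format


def pick_render_tier(session_tokens: int) -> RenderTier:
    """Map an estimated session size to a rendering tier.

    Thresholds follow the commit message (<2K, 2K-8K, >8K). Treating exactly
    2,000 and 8,000 tokens as Tier 2 is an assumption of this sketch.
    """
    if session_tokens < 2_000:
        return RenderTier.ULTRA_COMPACT
    if session_tokens <= 8_000:
        return RenderTier.MODERATE
    return RenderTier.FULL
```

Under these assumptions a mid-sized session of about 5K tokens lands in Tier 2, which is the case the commit message says now triggers ACTIVE instead of FALLBACK.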

…rginal compacting

Sessions where compact saves <500 tokens are now FALLBACK ("savings_too_small") even when the ratio check passes.

Token estimation uses ceil(len/4) heuristic with inherent noise; small absolute savings are within estimation error and don't justify compaction's information-loss risk.

Co-Authored-By: deepseekV4-pro <noreply@deepseek.com>
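
The interaction between the ceil(len/4) estimate, the ratio check, and the absolute savings floor can be sketched as below. The 600-token and 500-token thresholds and the reason strings are taken from the PR text; the function names, the order of the checks, the `fallback_ratio` default of 0.8, and the `"ok"` reason are assumptions, since the PR does not state them.

```python
import math
from typing import Tuple


def estimate_tokens(text: str) -> int:
    """ceil(len/4) heuristic named in the commit message."""
    return math.ceil(len(text) / 4)


def decide(
    full_text: str,
    compact_text: str,
    min_session_tokens: int = 600,  # from the PR description
    fallback_ratio: float = 0.8,    # assumed value; the default is not stated in the PR
    min_savings_tokens: int = 500,  # from the commit message
) -> Tuple[str, str]:
    """Return (mode, reason), where mode is "ACTIVE" or "FALLBACK"."""
    full = estimate_tokens(full_text)
    compact = estimate_tokens(compact_text)
    if full < min_session_tokens:
        return "FALLBACK", "session_too_short"
    if compact > full * fallback_ratio:
        return "FALLBACK", "compact_not_smaller_enough"
    if full - compact < min_savings_tokens:
        # Ratio check passed, but the absolute gain is within estimation noise.
        return "FALLBACK", "savings_too_small"
    return "ACTIVE", "ok"
```

For example, a session estimated at 1,000 tokens that compacts to 700 passes a 0.8 ratio check but saves only 300 tokens, so this sketch returns `("FALLBACK", "savings_too_small")`.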

jcp0578 · 2026-05-19T15:43:31Z

benchmark/locomo/data/* have been removed from the PR after the latest force-push, so earlier auto-generated review notes referencing those files are now stale.

github-actions · 2026-05-20T10:10:29Z

Persistent review updated to latest commit f34913b

github-actions · 2026-05-20T10:12:16Z

PR Code Suggestions ✨

No code suggestions found for the PR.


github-project-automation Bot added this to OpenViking project May 19, 2026

github-project-automation Bot moved this to Backlog in OpenViking project May 19, 2026

github-actions Bot added the Review effort 4/5 label May 19, 2026

jcp0578 changed the title ~~Memory Preprocessing~~ WM v2 Memory Preprocessing May 19, 2026

jcp0578 changed the title ~~WM v2 Memory Preprocessing~~ [WIP]WM v2 Memory Preprocessing May 19, 2026

jcp0578 force-pushed the feat/wm-v2-token-distillation branch 3 times, most recently from f7bb425 to 7ce8372 Compare May 19, 2026 15:32

jcp0578 and others added 3 commits May 19, 2026 23:32

jcp0578 force-pushed the feat/wm-v2-token-distillation branch from 7ce8372 to a0a8670 Compare May 19, 2026 15:33

jcp0578 changed the title ~~[WIP]WM v2 Memory Preprocessing~~ [WIP] Testing in progress - WM v2 Memory Preprocessing May 20, 2026

jcp0578 marked this pull request as ready for review May 20, 2026 10:09

jcp0578 force-pushed the feat/wm-v2-token-distillation branch 2 times, most recently from 4b5e8b6 to 7026535 Compare May 20, 2026 10:23

Merge branch 'main' into feat/wm-v2-token-distillation

c55362a

jcp0578 force-pushed the feat/wm-v2-token-distillation branch from 0df5462 to c55362a Compare May 20, 2026 12:41

jcp0578 added 6 commits May 20, 2026 21:28

improve(preprocessor): dedupe wiring, add creation telemetry, and expose savings threshold

ecb554e

fix(preprocessor): harden preference and metadata signal detection

77ec9c5

fix(preprocessor): tighten signal matching and span selection

7dc5ab2

fix(preprocessor): align token estimation and share WM constants

a66f4f9

revert(openclaw-plugin): drop auto-recall changes

e543727

Merge remote-tracking branch 'origin/main' into feat/wm-v2-token-distillation

33211f5

style(preprocessor): format WM preprocessing files and fixtures

f9ad256

jcp0578 force-pushed the feat/wm-v2-token-distillation branch from 0f9c079 to 524c8cc Compare May 21, 2026 15:53

Merge branch 'main' into feat/wm-v2-token-distillation

8b310ad

jcp0578 force-pushed the feat/wm-v2-token-distillation branch from 524c8cc to 8b310ad Compare May 21, 2026 16:11

jcp0578 wants to merge 13 commits into volcengine:main from jcp0578:feat/wm-v2-token-distillation
