refactor: consolidate workspace crates — 14→11 (delete tui-core, merge hooks+agent)#2256
refactor: consolidate workspace crates — 14→11 (delete tui-core, merge hooks+agent)#2256Hmbown wants to merge 17 commits into
Conversation
~/.agents/ is a vendor-neutral location intended to be shared across tools. WHALE.md is CodeWhale-specific — placing it there at priority 3 ahead of AGENTS.md at priority 4 silently shadows a user's cross-tool config. Restrict ~/.agents/ to AGENTS.md only. Resolves Greptile review issues on PR #2236.
…ormat - New verify.rs: closed-loop verification gate re-checks side-effect tool claims (write_file, edit_file, apply_patch, exec_shell) before the result enters the session stream. [VERIFY FAIL] annotations flag contradictions; verdicts recorded in session verification ledger. Enabled by default; configurable via [verification] in config.toml. - All native tools loaded upfront: removed deferred loading policy. Every tool is visible from turn one instead of requiring tool_search discovery. 80+ tools in catalog, ~1% context overhead. - response_format field on MessageRequest: supports DeepSeek JSON Output mode for structured responses when needed. - Constitution (base.md): clarified 'begin with an A' as grade metaphor (A+ awarded in advance). Article I connected to credit-first stance. Article V updated with structural verification note. - System prompt: verification gate statute added. - config.example.toml: [verification] section. - docs/ARCHITECTURE.md: step 7.5 verification gate in data flow.
…gration/v0.8.47-unified
- Workspace version: 0.8.46 → 0.8.47 - All inter-crate dependency versions updated - Merged PR #2236: global AGENTS.md from ~/.agents/
- New ProviderKind::Xiaomi across config, TUI, CLI, and agent crates - Default endpoint: https://token-plan-cn.xiaomimimo.com/v1 - Models: mimo-v2.5-pro, mimo-v2.5, mimo-v2-flash - MiMo-specific thinking toggle (enabled/disabled) - /models command for Xiaomi provider - Provider-specific model picker UI Co-Authored-By: Hu Qiantao <huqiantao@HudeMacBook-Air.local>
DEEPSEEK.md is redundant — WHALE.md, AGENTS.md, and CLAUDE.md already cover the project-context surface. Adding another filename just increases the scan surface without a distinct use case.
From @reidliu41 — the /provider modal now sizes dynamically, scrolls to keep the selected row visible, wraps Up/Down between first and last providers, and renders the selected row with a full-row highlight.
CHANGELOG: [Unreleased] promoted to [0.8.47] with 16 Added, 5 Changed, 11 Fixed entries covering closed-loop verification, all-tools-loaded, runtime goal tools, DDG default, Xiaomi MiMo, global AGENTS.md fallback, DeepSeek V4 Pro pricing, and PR #2241 merge. READMEs: 15 new contributors added across en/zh-CN/ja-JP: @Fire-dtx, @imkingjh999, @harvey2011888, @victorcheng2333, @IIzzaya, @PurplePulse, @cyq1017, @knqiufan, @Colorful-glassblock, @hongqitai, @EmiyaKiritsugu3, @HUQIANTAO, @mvanhorn. Updated @reidliu41 and @aboimpinto entries with new PRs.
- Merged #2235 (/new session command from @reidliu41, all CI green) - Added Xiaomi MiMo to provider examples and env-var table - Added verification gate paragraph in harness section - Updated @reidliu41 entries with #2235 across all READMEs - CHANGELOG: added /new command entry
- Provider picker: Xiaomi removed from provider_passes_model_through so models aren't passed through verbatim. switch_provider now clears default_text_model when switching to a different provider, ensuring the new provider's default model (mimo-v2.5-pro for Xiaomi) is used. - Footer: removed 'saved $X.XX' cache-savings hint. Cache pricing is the normal price, not savings. Removed dead last_turn_cache_savings and calculate_cache_savings functions. - CHANGELOG: crates/tui/CHANGELOG.md replaced with symlink to root CHANGELOG.md, fixing the changelog_entry_exists test. - Test: provider picker wrap test updated for Xiaomi as last provider.
The verification gate now does inline file checks for write_file, edit_file, and apply_patch — actually reading the file back to confirm the operation landed instead of returning Pass blindly. When a [VERIFY FAIL] is detected, file-mutating tools are automatically re-executed with the same input (up to verification_max_retries, default 2). Successful retries are annotated as '[VERIFY PASS] auto-retried N time(s)' instead of showing failure annotations. This collapses mechanical retries into a single turn — the model never sees routine I/O flakiness. Pure engine retry, no LLM dependency. The Flash/Fir inner-loop for edit_file search correction is the next layer. Changes: - verify.rs: run_verification now calls inline_verify_file_tool for write_file/edit_file/apply_patch; is_auto_retryable helper added - tool_execution.rs: Engine::retry_file_tool re-executes tools - turn_loop.rs: retry loop around verification block, collapses retry annotations - EngineConfig: verification_max_retries field (default 2) - config.example.toml: max_retries updated to 2
When edit_file fails with 'search string not found', the engine now
runs a deterministic Levenshtein-based fuzzy matcher against the file
to find the closest matching text. If a match exceeds the 60% similarity
threshold, the search string is auto-corrected and the tool is retried.
The fuzzy matcher handles whitespace diffs, bracket mismatches, and
minor typos — the common cases where the model wrote 'fn main() {'
but the file has 'fn main() {' or 'pub fn main() {'.
This runs in-process, sub-millisecond, zero API cost. Option B (Fin
Flash inner-loop for edge cases where the fuzzy matcher can't find
a match) is marked as a TODO in the error handler — the hook point
is ready, just needs the Flash API call wired in.
New in verify.rs:
- fuzzy_correct_search(): Levenshtein matcher with line-by-line and
multi-line window comparison
- correct_edit_file_input(): builds corrected tool input with
fuzzy_correction metadata annotation
- 4 new tests covering exact match, whitespace diff, unrelated search,
and Levenshtein distance correctness
When the deterministic fuzzy matcher can't find a close match (score < 60%), the engine now spawns a Fin Flash call (deepseek-v4-flash, thinking off) with the file content and failed search string. Flash returns the closest matching text, and the edit_file is retried with the corrected search string. The Flash call is cheap (~sh.0001, ~200ms) because it shares the prefix cache already primed by the parent turn. This completes the three-tier correction strategy: 1. Pure engine retry (same input, I/O flakiness) 2. Fuzzy matcher (Levenshtein, in-process, zero cost) 3. Fin Flash (LLM-powered, cache-shared, near-zero cost) New in verify.rs: flash_correct_search() builds a minimal MessageRequest for Flash and parses the response. Wired into turn_loop.rs error handler at the Option B fallback point.
…or credits - CHANGELOG: add @LING71671 (#1797) contributor credit - CHANGELOG: add v0.8.47 compare link, bump Unreleased ref - npm: bump codewhale + deepseek-tui to 0.8.47 - clippy: fix unused_assignments, dead_code, collapsible_if warnings - cargo fmt: normalize indentation across tui crate
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
| match std::fs::read_to_string(&resolved) { | ||
| Ok(content) if !content.is_empty() => VerifyVerdict::Pass, | ||
| Ok(_) => VerifyVerdict::Fail { | ||
| expected: format!("non-empty file at {}", resolved.display()), | ||
| observed: "file is empty".to_string(), | ||
| }, | ||
| Err(_) => VerifyVerdict::Fail { | ||
| expected: format!("file exists at {}", resolved.display()), | ||
| observed: "file missing after write".to_string(), | ||
| }, | ||
| } |
There was a problem hiding this comment.
False
VERIFY FAIL on intentionally empty files
inline_verify_file_tool treats an empty file as a verification failure. Any call to write_file or edit_file that legitimately produces zero bytes (e.g. truncating a config file, creating an empty marker file) will fail this check, trigger up to 2 automatic retries, and ultimately inject [VERIFY FAIL] into the session stream — telling the LLM the write failed when it actually succeeded. Since verification_enabled is hardcoded true in build_engine_config, this affects every user.
| match std::fs::read_to_string(&resolved) { | |
| Ok(content) if !content.is_empty() => VerifyVerdict::Pass, | |
| Ok(_) => VerifyVerdict::Fail { | |
| expected: format!("non-empty file at {}", resolved.display()), | |
| observed: "file is empty".to_string(), | |
| }, | |
| Err(_) => VerifyVerdict::Fail { | |
| expected: format!("file exists at {}", resolved.display()), | |
| observed: "file missing after write".to_string(), | |
| }, | |
| } | |
| match std::fs::metadata(&resolved) { | |
| Ok(_) => VerifyVerdict::Pass, | |
| Err(_) => VerifyVerdict::Fail { | |
| expected: format!("file exists at {}", resolved.display()), | |
| observed: "file missing after write".to_string(), | |
| }, | |
| } |
| if !corrected { | ||
| match loop_guard.record_outcome(&outcome.name, false) { | ||
| OutcomeDecision::Continue => {} | ||
| OutcomeDecision::Warn(message) => { | ||
| crate::logging::warn(message.clone()); | ||
| let _ = self.tx_event.send(Event::status(message)).await; | ||
| } | ||
| OutcomeDecision::Halt(message) => { | ||
| loop_guard_halt.get_or_insert(message); | ||
| } | ||
| } | ||
| } | ||
| let envelope: ErrorEnvelope = e.clone().into(); | ||
| emit_tool_audit(json!({ | ||
| "event": "tool.result", | ||
| "tool_id": outcome.id.clone(), | ||
| "tool_name": outcome.name.clone(), | ||
| "success": false, | ||
| "error": e.to_string(), | ||
| "category": envelope.category.to_string(), | ||
| "severity": envelope.severity.to_string(), | ||
| })); | ||
| step_error_count += 1; | ||
| step_error_categories.push(envelope.category); | ||
| if corrected { | ||
| // Already handled above — skip error recording. | ||
| } else { | ||
| let envelope: ErrorEnvelope = e.clone().into(); | ||
| emit_tool_audit(json!({ | ||
| "event": "tool.result", | ||
| "tool_id": outcome.id.clone(), | ||
| "tool_name": outcome.name.clone(), | ||
| "success": false, | ||
| "error": e.to_string(), | ||
| "category": envelope.category.to_string(), | ||
| "severity": envelope.severity.to_string(), | ||
| })); | ||
| step_error_count += 1; | ||
| step_error_categories.push(envelope.category); | ||
| } |
There was a problem hiding this comment.
corrected is always false — if corrected branch is dead code
corrected is declared as an immutable false and never reassigned. Both if !corrected { … } (loop guard recording) and the else branch of if corrected { … } (error audit + step_error_count) therefore always execute. The successful-correction branches already use continue to skip this code, so the observable behaviour is correct — but the dead if corrected { } block is misleading and the mut intent was clearly lost. The variable should be removed and the two conditional blocks flattened.
| if !corrected { | |
| match loop_guard.record_outcome(&outcome.name, false) { | |
| OutcomeDecision::Continue => {} | |
| OutcomeDecision::Warn(message) => { | |
| crate::logging::warn(message.clone()); | |
| let _ = self.tx_event.send(Event::status(message)).await; | |
| } | |
| OutcomeDecision::Halt(message) => { | |
| loop_guard_halt.get_or_insert(message); | |
| } | |
| } | |
| } | |
| let envelope: ErrorEnvelope = e.clone().into(); | |
| emit_tool_audit(json!({ | |
| "event": "tool.result", | |
| "tool_id": outcome.id.clone(), | |
| "tool_name": outcome.name.clone(), | |
| "success": false, | |
| "error": e.to_string(), | |
| "category": envelope.category.to_string(), | |
| "severity": envelope.severity.to_string(), | |
| })); | |
| step_error_count += 1; | |
| step_error_categories.push(envelope.category); | |
| if corrected { | |
| // Already handled above — skip error recording. | |
| } else { | |
| let envelope: ErrorEnvelope = e.clone().into(); | |
| emit_tool_audit(json!({ | |
| "event": "tool.result", | |
| "tool_id": outcome.id.clone(), | |
| "tool_name": outcome.name.clone(), | |
| "success": false, | |
| "error": e.to_string(), | |
| "category": envelope.category.to_string(), | |
| "severity": envelope.severity.to_string(), | |
| })); | |
| step_error_count += 1; | |
| step_error_categories.push(envelope.category); | |
| } | |
| match loop_guard.record_outcome(&outcome.name, false) { | |
| OutcomeDecision::Continue => {} | |
| OutcomeDecision::Warn(message) => { | |
| crate::logging::warn(message.clone()); | |
| let _ = self.tx_event.send(Event::status(message)).await; | |
| } | |
| OutcomeDecision::Halt(message) => { | |
| loop_guard_halt.get_or_insert(message); | |
| } | |
| } | |
| let envelope: ErrorEnvelope = e.clone().into(); | |
| emit_tool_audit(json!({ | |
| "event": "tool.result", | |
| "tool_id": outcome.id.clone(), | |
| "tool_name": outcome.name.clone(), | |
| "success": false, | |
| "error": e.to_string(), | |
| "category": envelope.category.to_string(), | |
| "severity": envelope.severity.to_string(), | |
| })); | |
| step_error_count += 1; | |
| step_error_categories.push(envelope.category); |
| if outcome.name == "edit_file" | ||
| && verify::correct_edit_file_input( | ||
| &tool_input, | ||
| &error_str, | ||
| &self.session.workspace, | ||
| ) | ||
| .is_some() | ||
| { | ||
| let corrected_input = verify::correct_edit_file_input( | ||
| &tool_input, | ||
| &error_str, | ||
| &self.session.workspace, | ||
| ) | ||
| .unwrap(); |
There was a problem hiding this comment.
correct_edit_file_input called twice per failure — the function reads the file and runs fuzzy matching both times. Use if let Some(corrected_input) = … to compute it once.
| if outcome.name == "edit_file" | |
| && verify::correct_edit_file_input( | |
| &tool_input, | |
| &error_str, | |
| &self.session.workspace, | |
| ) | |
| .is_some() | |
| { | |
| let corrected_input = verify::correct_edit_file_input( | |
| &tool_input, | |
| &error_str, | |
| &self.session.workspace, | |
| ) | |
| .unwrap(); | |
| if outcome.name == "edit_file" | |
| && let Some(corrected_input) = verify::correct_edit_file_input( | |
| &tool_input, | |
| &error_str, | |
| &self.session.workspace, | |
| ) | |
| { |
| let elapsed_ms = started.elapsed().as_millis() as u64; | ||
| let _ = elapsed_ms; |
There was a problem hiding this comment.
Verification ledger always records
elapsed_ms: 0 — started.elapsed() is computed inside run_verification and then discarded with let _ = elapsed_ms;. The VerifyRecord at the call site in turn_loop.rs then hardcodes elapsed_ms: 0. Either return the elapsed duration from run_verification or measure it at the call site.
| let elapsed_ms = started.elapsed().as_millis() as u64; | |
| let _ = elapsed_ms; | |
| let elapsed_ms = started.elapsed().as_millis() as u64; |
Summary
Reduces the Cargo workspace from 14 crates to 11 by removing dead code and merging thin abstraction layers. Zero behavioral changes.
Changes
tui-corecorehookscoreandapp-serverused it;app-serveralready depends oncore. No new edges.configagentconfig(ProviderKind); all consumers already depend onconfig. No new edges.Why these three?
tui-core: Dead code. Nouse codewhale_tui_coreanywhere in the tree.hooks(296 LOC): Onlycore+app-serverconsumers.app-serveralready takescore.agent(539 LOC): Thin model registry that only depends onconfigforProviderKind. All 3 consumers already takeconfig.Why NOT
execpolicy/mcp/tools?They're all consumed by
clidirectly, butclidoesn't depend oncore. Merging them intocorewould forceclito pull in the full orchestration crate (+ transitive deps). They stay separate.Verification
cargo check --workspace✅cargo build --workspace✅cargo test --workspace✅ — 3,843 tests passed, 0 failedCloses #2249, closes #2250, closes #2251
Greptile Summary
This PR bundles a workspace consolidation (14→11 crates, removing
tui-core, merginghooks→coreandagent→config) with three substantive feature additions: Xiaomi MiMo provider support, a closed-loop verification/auto-retry gate for file-mutating tools, and V4-Pro pricing made permanent.tui-coredeleted (zero dependents),hooksmerged intocore, andagentmerged intoconfig; all version bumps from 0.8.46→0.8.47.ProviderKind::Xiaomiadded end-to-end — config, env vars (MIMO_API_KEY,MIMO_BASE_URL), model registry (mimo-v2.5-pro,mimo-v2.5), provider/model pickers, andcrates/secretsenv mapping.verify.rs+turn_loop.rs): after every file-mutating tool (write_file,edit_file,apply_patch) the engine reads the file back to confirm the write landed; failures trigger auto-retry (up to 2×) and inject[VERIFY FAIL]annotations into the session stream. A secondary fuzzy + Flash-LLM correction path handlesedit_file"search string not found" errors.Confidence Score: 3/5
The workspace consolidation and Xiaomi provider additions are clean. The verification gate introduces a real behavioural defect for empty-file writes, and
turn_loop.rscarries dead/misleading control-flow around the correction variable.The verification gate always flags a successfully-written empty file as a failure, retrying it twice and then sending
[VERIFY FAIL]to the model. Thecorrectedvariable inturn_loop.rsis declared immutablefalseand never changes, making one branch dead and the other always-active — while the correct path relies oncontinuestatements that happen to rescue the logic, the intent expressed in the code is wrong.crates/tui/src/core/engine/verify.rs (empty-file check in
inline_verify_file_tool) and crates/tui/src/core/engine/turn_loop.rs (deadcorrectedvariable and double call tocorrect_edit_file_input)Important Files Changed
inline_verify_file_tooltreats empty-file writes as failures, causing false[VERIFY FAIL]annotations and unnecessary retries for intentionally empty files. Elapsed timing is computed but discarded, so all ledger records storeelapsed_ms: 0.let corrected = falseis never mutated, makingif corrected { … }dead code andif !corrected { … }always true.correct_edit_file_inputis also called twice per failure (.is_some()+.unwrap()), duplicating fuzzy-match work.calculate_cache_savingsfunction removed.verification_enabled: trueandverification_max_retries: 2inbuild_engine_config; these values are not user-configurable, and overlap with the now-unusedEngineConfigdefaults and theVerifyConfigstruct.ApiProviderenum and all switch arms; default model/URL constants defined;provider_capabilityblock added for Xiaomi models.~/.agents/AGENTS.mdas a vendor-neutral fallback between.codewhale/and.deepseek/; priority order is correct and well-tested.MIMO_API_KEYlookup forxiaomi/mimo/xiaomi-mimoprovider names inenv_for; follows existing pattern exactly.Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[Tool executes] --> B{Ok or Err?} B -->|Ok - success| C{verification_enabled AND tool_was_executed?} C -->|No| D[Add session message with raw output] C -->|Yes| E[run_verification tool_name + tool_input] E --> F{VerifyVerdict?} F -->|Pass / Skipped / Unverifiable| G[Annotate output with verdict] F -->|Fail AND retryable AND retry_count lt max| H[retry_file_tool retry_count++] H --> E F -->|Fail AND retries exhausted| I[Annotate VERIFY FAIL] G --> D I --> D2[Add session message with VERIFY FAIL annotation is_error=Some true] B -->|Err - failure| J{outcome.name == edit_file AND correct_edit_file_input is Some?} J -->|No| K[record_outcome emit_tool_audit step_error_count++ Add Error message] J -->|Yes - fuzzy match found| L[retry_file_tool with corrected input] L -->|retry_ok=true| M[continue skip error path] L -->|retry_ok=false| N{deepseek_client available? Flash LLM correction} N -->|Flash succeeds| O[retry_file_tool with Flash-corrected input] O -->|ok=true| M O -->|failed| K N -->|Flash unavailable or fails| KReviews (1): Last reviewed commit: "chore(release): finalize v0.8.47 — clipp..." | Re-trigger Greptile