Skip to content

refactor: consolidate workspace crates — 14→11 (delete tui-core, merge hooks+agent)#2256

Open
Hmbown wants to merge 17 commits into
mainfrom
feature/v0.8.48-crate-consolidation
Open

refactor: consolidate workspace crates — 14→11 (delete tui-core, merge hooks+agent)#2256
Hmbown wants to merge 17 commits into
mainfrom
feature/v0.8.48-crate-consolidation

Conversation

@Hmbown
Copy link
Copy Markdown
Owner

@Hmbown Hmbown commented May 27, 2026

Summary

Reduces the Cargo workspace from 14 crates to 11 by removing dead code and merging thin abstraction layers. Zero behavioral changes.

Changes

Action Crate Lines Rationale
❌ Deleted tui-core 192 Orphaned — zero dependents, 192-line scaffold never wired to the real TUI
🔀 Merged → core hooks 296 Only core and app-server used it; app-server already depends on core. No new edges.
🔀 Merged → config agent 539 Only depended on config (ProviderKind); all consumers already depend on config. No new edges.

Why these three?

  • tui-core: Dead code. No use codewhale_tui_core anywhere in the tree.
  • hooks (296 LOC): Only core + app-server consumers. app-server already takes core.
  • agent (539 LOC): Thin model registry that only depends on config for ProviderKind. All 3 consumers already take config.

Why NOT execpolicy/mcp/tools?

They're all consumed by cli directly, but cli doesn't depend on core. Merging them into core would force cli to pull in the full orchestration crate (+ transitive deps). They stay separate.

Verification

  • cargo check --workspace
  • cargo build --workspace
  • cargo test --workspace ✅ — 3,843 tests passed, 0 failed

Closes #2249, closes #2250, closes #2251

Greptile Summary

This PR bundles a workspace consolidation (14→11 crates, removing tui-core, merging hookscore and agentconfig) with three substantive feature additions: Xiaomi MiMo provider support, a closed-loop verification/auto-retry gate for file-mutating tools, and V4-Pro pricing made permanent.

  • Workspace consolidation: tui-core deleted (zero dependents), hooks merged into core, and agent merged into config; all version bumps from 0.8.46→0.8.47.
  • Xiaomi MiMo provider: ProviderKind::Xiaomi added end-to-end — config, env vars (MIMO_API_KEY, MIMO_BASE_URL), model registry (mimo-v2.5-pro, mimo-v2.5), provider/model pickers, and crates/secrets env mapping.
  • Verification gate (verify.rs + turn_loop.rs): after every file-mutating tool (write_file, edit_file, apply_patch) the engine reads the file back to confirm the write landed; failures trigger auto-retry (up to 2×) and inject [VERIFY FAIL] annotations into the session stream. A secondary fuzzy + Flash-LLM correction path handles edit_file "search string not found" errors.

Confidence Score: 3/5

The workspace consolidation and Xiaomi provider additions are clean. The verification gate introduces a real behavioural defect for empty-file writes, and turn_loop.rs carries dead/misleading control-flow around the correction variable.

The verification gate always flags a successfully-written empty file as a failure, retrying it twice and then sending [VERIFY FAIL] to the model. The corrected variable in turn_loop.rs is declared immutable false and never changes, making one branch dead and the other always-active — while the correct path relies on continue statements that happen to rescue the logic, the intent expressed in the code is wrong.

crates/tui/src/core/engine/verify.rs (empty-file check in inline_verify_file_tool) and crates/tui/src/core/engine/turn_loop.rs (dead corrected variable and double call to correct_edit_file_input)

Important Files Changed

Filename Overview
crates/tui/src/core/engine/verify.rs New 815-line verification gate; inline_verify_file_tool treats empty-file writes as failures, causing false [VERIFY FAIL] annotations and unnecessary retries for intentionally empty files. Elapsed timing is computed but discarded, so all ledger records store elapsed_ms: 0.
crates/tui/src/core/engine/turn_loop.rs Adds verification gate invocation and fuzzy/Flash edit-correction path; let corrected = false is never mutated, making if corrected { … } dead code and if !corrected { … } always true. correct_edit_file_input is also called twice per failure (.is_some() + .unwrap()), duplicating fuzzy-match work.
crates/config/src/lib.rs Xiaomi/MiMo provider added symmetrically with all other providers — get/set/unset/list/env, secrets lookup, and merge_project_provider_config all updated correctly.
crates/tui/src/pricing.rs Time-limited V4-Pro discount removed; discount is now permanent. Old discount-expiry tests replaced with always-pass variants; dead calculate_cache_savings function removed.
crates/tui/src/tui/ui.rs Hardcodes verification_enabled: true and verification_max_retries: 2 in build_engine_config; these values are not user-configurable, and overlap with the now-unused EngineConfig defaults and the VerifyConfig struct.
crates/tui/src/config.rs Xiaomi MiMo provider added to ApiProvider enum and all switch arms; default model/URL constants defined; provider_capability block added for Xiaomi models.
crates/tui/src/project_context.rs Adds ~/.agents/AGENTS.md as a vendor-neutral fallback between .codewhale/ and .deepseek/; priority order is correct and well-tested.
crates/secrets/src/lib.rs Adds MIMO_API_KEY lookup for xiaomi/mimo/xiaomi-mimo provider names in env_for; follows existing pattern exactly.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Tool executes] --> B{Ok or Err?}

    B -->|Ok - success| C{verification_enabled AND tool_was_executed?}
    C -->|No| D[Add session message with raw output]
    C -->|Yes| E[run_verification tool_name + tool_input]

    E --> F{VerifyVerdict?}
    F -->|Pass / Skipped / Unverifiable| G[Annotate output with verdict]
    F -->|Fail AND retryable AND retry_count lt max| H[retry_file_tool retry_count++]
    H --> E
    F -->|Fail AND retries exhausted| I[Annotate VERIFY FAIL]

    G --> D
    I --> D2[Add session message with VERIFY FAIL annotation is_error=Some true]

    B -->|Err - failure| J{outcome.name == edit_file AND correct_edit_file_input is Some?}

    J -->|No| K[record_outcome emit_tool_audit step_error_count++ Add Error message]
    J -->|Yes - fuzzy match found| L[retry_file_tool with corrected input]

    L -->|retry_ok=true| M[continue skip error path]
    L -->|retry_ok=false| N{deepseek_client available? Flash LLM correction}
    N -->|Flash succeeds| O[retry_file_tool with Flash-corrected input]
    O -->|ok=true| M
    O -->|failed| K
    N -->|Flash unavailable or fails| K
Loading

Fix All in Devin

Reviews (1): Last reviewed commit: "chore(release): finalize v0.8.47 — clipp..." | Re-trigger Greptile

Greptile also left 4 inline comments on this PR.

mvanhorn and others added 17 commits May 26, 2026 17:13
~/.agents/ is a vendor-neutral location intended to be shared across
tools. WHALE.md is CodeWhale-specific — placing it there at priority 3
ahead of AGENTS.md at priority 4 silently shadows a user's
cross-tool config. Restrict ~/.agents/ to AGENTS.md only.

Resolves Greptile review issues on PR #2236.
…ormat

- New verify.rs: closed-loop verification gate re-checks side-effect
  tool claims (write_file, edit_file, apply_patch, exec_shell) before
  the result enters the session stream. [VERIFY FAIL] annotations flag
  contradictions; verdicts recorded in session verification ledger.
  Enabled by default; configurable via [verification] in config.toml.

- All native tools loaded upfront: removed deferred loading policy.
  Every tool is visible from turn one instead of requiring tool_search
  discovery. 80+ tools in catalog, ~1% context overhead.

- response_format field on MessageRequest: supports DeepSeek JSON
  Output mode for structured responses when needed.

- Constitution (base.md): clarified 'begin with an A' as grade
  metaphor (A+ awarded in advance). Article I connected to credit-first
  stance. Article V updated with structural verification note.

- System prompt: verification gate statute added.

- config.example.toml: [verification] section.
- docs/ARCHITECTURE.md: step 7.5 verification gate in data flow.
- Workspace version: 0.8.46 → 0.8.47
- All inter-crate dependency versions updated
- Merged PR #2236: global AGENTS.md from ~/.agents/
- #1937: Make DeepSeek V4 Pro 75% discount permanent (pricing.rs)
- #2237: Fix two clippy warnings (config.rs, runtime_log.rs)
- #1852: Add DEEPSEEK.md as project context file
- #1910: Suppress verbose CLI logging on Windows alt-screen
- New ProviderKind::Xiaomi across config, TUI, CLI, and agent crates
- Default endpoint: https://token-plan-cn.xiaomimimo.com/v1
- Models: mimo-v2.5-pro, mimo-v2.5, mimo-v2-flash
- MiMo-specific thinking toggle (enabled/disabled)
- /models command for Xiaomi provider
- Provider-specific model picker UI

Co-Authored-By: Hu Qiantao <huqiantao@HudeMacBook-Air.local>
DEEPSEEK.md is redundant — WHALE.md, AGENTS.md, and CLAUDE.md already
cover the project-context surface. Adding another filename just
increases the scan surface without a distinct use case.
From @reidliu41 — the /provider modal now sizes dynamically, scrolls to
keep the selected row visible, wraps Up/Down between first and last
providers, and renders the selected row with a full-row highlight.
CHANGELOG: [Unreleased] promoted to [0.8.47] with 16 Added, 5 Changed,
11 Fixed entries covering closed-loop verification, all-tools-loaded,
runtime goal tools, DDG default, Xiaomi MiMo, global AGENTS.md fallback,
DeepSeek V4 Pro pricing, and PR #2241 merge.

READMEs: 15 new contributors added across en/zh-CN/ja-JP:
@Fire-dtx, @imkingjh999, @harvey2011888, @victorcheng2333, @IIzzaya,
@PurplePulse, @cyq1017, @knqiufan, @Colorful-glassblock, @hongqitai,
@EmiyaKiritsugu3, @HUQIANTAO, @mvanhorn.
Updated @reidliu41 and @aboimpinto entries with new PRs.
- Merged #2235 (/new session command from @reidliu41, all CI green)
- Added Xiaomi MiMo to provider examples and env-var table
- Added verification gate paragraph in harness section
- Updated @reidliu41 entries with #2235 across all READMEs
- CHANGELOG: added /new command entry
- Provider picker: Xiaomi removed from provider_passes_model_through so
  models aren't passed through verbatim. switch_provider now clears
  default_text_model when switching to a different provider, ensuring
  the new provider's default model (mimo-v2.5-pro for Xiaomi) is used.

- Footer: removed 'saved $X.XX' cache-savings hint. Cache pricing is
  the normal price, not savings. Removed dead last_turn_cache_savings
  and calculate_cache_savings functions.

- CHANGELOG: crates/tui/CHANGELOG.md replaced with symlink to root
  CHANGELOG.md, fixing the changelog_entry_exists test.

- Test: provider picker wrap test updated for Xiaomi as last provider.
The verification gate now does inline file checks for write_file,
edit_file, and apply_patch — actually reading the file back to
confirm the operation landed instead of returning Pass blindly.

When a [VERIFY FAIL] is detected, file-mutating tools are
automatically re-executed with the same input (up to
verification_max_retries, default 2). Successful retries are
annotated as '[VERIFY PASS] auto-retried N time(s)' instead of
showing failure annotations. This collapses mechanical retries
into a single turn — the model never sees routine I/O flakiness.

Pure engine retry, no LLM dependency. The Flash/Fir inner-loop
for edit_file search correction is the next layer.

Changes:
- verify.rs: run_verification now calls inline_verify_file_tool
  for write_file/edit_file/apply_patch; is_auto_retryable helper
  added
- tool_execution.rs: Engine::retry_file_tool re-executes tools
- turn_loop.rs: retry loop around verification block, collapses
  retry annotations
- EngineConfig: verification_max_retries field (default 2)
- config.example.toml: max_retries updated to 2
When edit_file fails with 'search string not found', the engine now
runs a deterministic Levenshtein-based fuzzy matcher against the file
to find the closest matching text. If a match exceeds the 60% similarity
threshold, the search string is auto-corrected and the tool is retried.

The fuzzy matcher handles whitespace diffs, bracket mismatches, and
minor typos — the common cases where the model wrote 'fn main() {'
but the file has 'fn main()  {' or 'pub fn main() {'.

This runs in-process, sub-millisecond, zero API cost. Option B (Fin
Flash inner-loop for edge cases where the fuzzy matcher can't find
a match) is marked as a TODO in the error handler — the hook point
is ready, just needs the Flash API call wired in.

New in verify.rs:
- fuzzy_correct_search(): Levenshtein matcher with line-by-line and
  multi-line window comparison
- correct_edit_file_input(): builds corrected tool input with
  fuzzy_correction metadata annotation
- 4 new tests covering exact match, whitespace diff, unrelated search,
  and Levenshtein distance correctness
When the deterministic fuzzy matcher can't find a close match (score
< 60%), the engine now spawns a Fin Flash call (deepseek-v4-flash,
thinking off) with the file content and failed search string. Flash
returns the closest matching text, and the edit_file is retried with
the corrected search string.

The Flash call is cheap (~sh.0001, ~200ms) because it shares the
prefix cache already primed by the parent turn. This completes the
three-tier correction strategy:
  1. Pure engine retry (same input, I/O flakiness)
  2. Fuzzy matcher (Levenshtein, in-process, zero cost)
  3. Fin Flash (LLM-powered, cache-shared, near-zero cost)

New in verify.rs: flash_correct_search() builds a minimal MessageRequest
for Flash and parses the response. Wired into turn_loop.rs error handler
at the Option B fallback point.
…or credits

- CHANGELOG: add @LING71671 (#1797) contributor credit
- CHANGELOG: add v0.8.47 compare link, bump Unreleased ref
- npm: bump codewhale + deepseek-tui to 0.8.47
- clippy: fix unused_assignments, dead_code, collapsible_if warnings
- cargo fmt: normalize indentation across tui crate
Copilot AI review requested due to automatic review settings May 27, 2026 04:56
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Comment on lines +229 to +239
match std::fs::read_to_string(&resolved) {
Ok(content) if !content.is_empty() => VerifyVerdict::Pass,
Ok(_) => VerifyVerdict::Fail {
expected: format!("non-empty file at {}", resolved.display()),
observed: "file is empty".to_string(),
},
Err(_) => VerifyVerdict::Fail {
expected: format!("file exists at {}", resolved.display()),
observed: "file missing after write".to_string(),
},
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 False VERIFY FAIL on intentionally empty files

inline_verify_file_tool treats an empty file as a verification failure. Any call to write_file or edit_file that legitimately produces zero bytes (e.g. truncating a config file, creating an empty marker file) will fail this check, trigger up to 2 automatic retries, and ultimately inject [VERIFY FAIL] into the session stream — telling the LLM the write failed when it actually succeeded. Since verification_enabled is hardcoded true in build_engine_config, this affects every user.

Suggested change
match std::fs::read_to_string(&resolved) {
Ok(content) if !content.is_empty() => VerifyVerdict::Pass,
Ok(_) => VerifyVerdict::Fail {
expected: format!("non-empty file at {}", resolved.display()),
observed: "file is empty".to_string(),
},
Err(_) => VerifyVerdict::Fail {
expected: format!("file exists at {}", resolved.display()),
observed: "file missing after write".to_string(),
},
}
match std::fs::metadata(&resolved) {
Ok(_) => VerifyVerdict::Pass,
Err(_) => VerifyVerdict::Fail {
expected: format!("file exists at {}", resolved.display()),
observed: "file missing after write".to_string(),
},
}

Fix in Devin

Comment on lines +2215 to +2242
if !corrected {
match loop_guard.record_outcome(&outcome.name, false) {
OutcomeDecision::Continue => {}
OutcomeDecision::Warn(message) => {
crate::logging::warn(message.clone());
let _ = self.tx_event.send(Event::status(message)).await;
}
OutcomeDecision::Halt(message) => {
loop_guard_halt.get_or_insert(message);
}
}
}
let envelope: ErrorEnvelope = e.clone().into();
emit_tool_audit(json!({
"event": "tool.result",
"tool_id": outcome.id.clone(),
"tool_name": outcome.name.clone(),
"success": false,
"error": e.to_string(),
"category": envelope.category.to_string(),
"severity": envelope.severity.to_string(),
}));
step_error_count += 1;
step_error_categories.push(envelope.category);
if corrected {
// Already handled above — skip error recording.
} else {
let envelope: ErrorEnvelope = e.clone().into();
emit_tool_audit(json!({
"event": "tool.result",
"tool_id": outcome.id.clone(),
"tool_name": outcome.name.clone(),
"success": false,
"error": e.to_string(),
"category": envelope.category.to_string(),
"severity": envelope.severity.to_string(),
}));
step_error_count += 1;
step_error_categories.push(envelope.category);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 corrected is always falseif corrected branch is dead code

corrected is declared as an immutable false and never reassigned. Both if !corrected { … } (loop guard recording) and the else branch of if corrected { … } (error audit + step_error_count) therefore always execute. The successful-correction branches already use continue to skip this code, so the observable behaviour is correct — but the dead if corrected { } block is misleading and the mut intent was clearly lost. The variable should be removed and the two conditional blocks flattened.

Suggested change
if !corrected {
match loop_guard.record_outcome(&outcome.name, false) {
OutcomeDecision::Continue => {}
OutcomeDecision::Warn(message) => {
crate::logging::warn(message.clone());
let _ = self.tx_event.send(Event::status(message)).await;
}
OutcomeDecision::Halt(message) => {
loop_guard_halt.get_or_insert(message);
}
}
}
let envelope: ErrorEnvelope = e.clone().into();
emit_tool_audit(json!({
"event": "tool.result",
"tool_id": outcome.id.clone(),
"tool_name": outcome.name.clone(),
"success": false,
"error": e.to_string(),
"category": envelope.category.to_string(),
"severity": envelope.severity.to_string(),
}));
step_error_count += 1;
step_error_categories.push(envelope.category);
if corrected {
// Already handled above — skip error recording.
} else {
let envelope: ErrorEnvelope = e.clone().into();
emit_tool_audit(json!({
"event": "tool.result",
"tool_id": outcome.id.clone(),
"tool_name": outcome.name.clone(),
"success": false,
"error": e.to_string(),
"category": envelope.category.to_string(),
"severity": envelope.severity.to_string(),
}));
step_error_count += 1;
step_error_categories.push(envelope.category);
}
match loop_guard.record_outcome(&outcome.name, false) {
OutcomeDecision::Continue => {}
OutcomeDecision::Warn(message) => {
crate::logging::warn(message.clone());
let _ = self.tx_event.send(Event::status(message)).await;
}
OutcomeDecision::Halt(message) => {
loop_guard_halt.get_or_insert(message);
}
}
let envelope: ErrorEnvelope = e.clone().into();
emit_tool_audit(json!({
"event": "tool.result",
"tool_id": outcome.id.clone(),
"tool_name": outcome.name.clone(),
"success": false,
"error": e.to_string(),
"category": envelope.category.to_string(),
"severity": envelope.severity.to_string(),
}));
step_error_count += 1;
step_error_categories.push(envelope.category);

Fix in Devin

Comment on lines +2019 to +2032
if outcome.name == "edit_file"
&& verify::correct_edit_file_input(
&tool_input,
&error_str,
&self.session.workspace,
)
.is_some()
{
let corrected_input = verify::correct_edit_file_input(
&tool_input,
&error_str,
&self.session.workspace,
)
.unwrap();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 correct_edit_file_input called twice per failure — the function reads the file and runs fuzzy matching both times. Use if let Some(corrected_input) = … to compute it once.

Suggested change
if outcome.name == "edit_file"
&& verify::correct_edit_file_input(
&tool_input,
&error_str,
&self.session.workspace,
)
.is_some()
{
let corrected_input = verify::correct_edit_file_input(
&tool_input,
&error_str,
&self.session.workspace,
)
.unwrap();
if outcome.name == "edit_file"
&& let Some(corrected_input) = verify::correct_edit_file_input(
&tool_input,
&error_str,
&self.session.workspace,
)
{

Fix in Devin

Comment on lines +185 to +186
let elapsed_ms = started.elapsed().as_millis() as u64;
let _ = elapsed_ms;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Verification ledger always records elapsed_ms: 0started.elapsed() is computed inside run_verification and then discarded with let _ = elapsed_ms;. The VerifyRecord at the call site in turn_loop.rs then hardcodes elapsed_ms: 0. Either return the elapsed duration from run_verification or measure it at the call site.

Suggested change
let elapsed_ms = started.elapsed().as_millis() as u64;
let _ = elapsed_ms;
let elapsed_ms = started.elapsed().as_millis() as u64;

Fix in Devin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants