Skip to content

fix(tui): clear stale tool failures from sidebar (#1884)#2258

Open
Hmbown wants to merge 2 commits into
mainfrom
fix/1884-sidebar-stale-failures-clean
Open

fix(tui): clear stale tool failures from sidebar (#1884)#2258
Hmbown wants to merge 2 commits into
mainfrom
fix/1884-sidebar-stale-failures-clean

Conversation

@Hmbown
Copy link
Copy Markdown
Owner

@Hmbown Hmbown commented May 27, 2026

Problem

The sidebar's editorial_tool_rows function never deduplicated or capped failed tool entries, causing old failures like 'gh issue create (failed)' to linger as the top-ranked items even after the task moved on or the same tool succeeded in a later attempt.

Fix

Three targeted changes:

  1. Deduplicate failures per tool name — keep only the most recent failure for each tool
  2. Clear stale failures on success — when a tool name appears with Success status, any prior Failure entry for that name is removed from the candidate list
  3. Cap visible failures at 2 — excess failures are demoted to rank 3 (low-priority) so they don't crowd out running tasks and recent successes

Verification

  • cargo check --workspace
  • cargo test -p codewhale-tui ✅ — 3,449 passed, 0 failed

Closes #1884


Open in Devin Review

Greptile Summary

This PR fixes stale tool-failure entries lingering in the TUI sidebar by introducing three targeted improvements to editorial_tool_rows: per-tool failure deduplication, clearing of stale failures when a newer success is seen (correctly gated on OldestFirst order), and a two-failure visibility cap that demotes excess entries to rank 3.

  • A new ToolRowOrder enum is threaded through both callers (active_tool_rows uses OldestFirst, recent_tool_rows uses NewestFirst) to guard the retain-on-success logic against removing still-fresh failures in the history path.
  • visible_failure_count is correctly decremented and seen_failures is cleared when the retain removes candidates, reclaiming failure slots for subsequent failures.
  • Three new unit tests cover the key ordering scenarios and the slot-reclaim behaviour.

Confidence Score: 5/5

Safe to merge; the changes are scoped to sidebar display logic and carry no risk of data loss or crashes.

All three issues flagged in previous review rounds (retain direction, seen_failures cleanup, visible_failure_count decrement) are correctly addressed. The only remaining gap is cosmetic: in the OldestFirst path, repeated failures for the same tool keep the oldest error summary rather than the most recent one. This does not break any existing behaviour.

crates/tui/src/tui/sidebar.rs — specifically the seen_failures deduplication block around the OldestFirst path.

Important Files Changed

Filename Overview
crates/tui/src/tui/sidebar.rs Adds failure deduplication, stale-failure clearing on success, and a visible-failure cap (MAX=2) to editorial_tool_rows. The retain/counter cleanup for OldestFirst is correct; a minor ordering issue causes the oldest failure summary (not the newest) to be kept when a tool fails multiple times consecutively in the OldestFirst path.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[editorial_tool_rows called] --> B{row.status == Failed?}
    B -- Yes --> C{NewestFirst AND tool already succeeded?}
    C -- Yes --> SKIP1[continue — stale failure dropped]
    C -- No --> D{seen_failures contains tool?}
    D -- Yes --> SKIP2[continue — duplicate failure dropped]
    D -- No --> E[push to seen_failures\napply failure_summary_with_hint]
    E --> F{visible_failure_count >= MAX 2?}
    F -- Yes --> G[rank = 3\ndemoted failure]
    F -- No --> H[visible_failure_count++\nrank = 0]
    B -- No --> I{row.status == Success AND OldestFirst?}
    I -- Yes --> J[candidates.retain: remove same-tool failures\ndecrement visible_failure_count\nclear seen_failures entry]
    I -- No --> K[rank = tool_row_rank]
    G --> PUSH[candidates.push]
    H --> PUSH
    J --> K
    K --> PUSH
    PUSH --> SORT[sort by rank, order\ntake limit]
    SORT --> OUT[final SidebarToolRow list]
Loading

Fix All in Devin

Reviews (2): Last reviewed commit: "fix(tui): make sidebar failure clearing ..." | Re-trigger Greptile

Deduplicate failures per tool name, clear stale failures when
the same tool succeeds, and cap visible failures at 2 so old
failures don't crowd the sidebar after the task moves on.
Copilot AI review requested due to automatic review settings May 27, 2026 05:05
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request modifies the sidebar tool row rendering logic to deduplicate and limit visible failures, preventing stale failures from crowding the sidebar. The reviewer identified two critical issues related to the processing order (most recent first): first, older failures should be skipped if a more recent success has already been seen; second, the candidates.retain block incorrectly removes more recent failures when processing older successes and should be removed.

Comment on lines 1164 to 1174
if row.status == ToolStatus::Failed {
// Deduplicate failures for the same tool name: keep only the most
// recent failure per tool. Fixes #1884 — stale failures from
// tools that have since succeeded no longer crowd the sidebar.
let fail_key = row.name.trim().to_ascii_lowercase();
if seen_failures.contains(&fail_key) {
continue;
}
seen_failures.push(fail_key);
row.summary = failure_summary_with_hint(&row.summary);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Because rows is processed in reverse chronological order (most recent first), the most recent executions are processed first. This means if a tool failed and then succeeded, the Success row is processed first, and the older Failure row is processed later.

To prevent the older Failure row from being added to candidates, we should check seen_tool_names_succeeded here and skip the failure if a more recent success has already been seen.

Suggested change
if row.status == ToolStatus::Failed {
// Deduplicate failures for the same tool name: keep only the most
// recent failure per tool. Fixes #1884 — stale failures from
// tools that have since succeeded no longer crowd the sidebar.
let fail_key = row.name.trim().to_ascii_lowercase();
if seen_failures.contains(&fail_key) {
continue;
}
seen_failures.push(fail_key);
row.summary = failure_summary_with_hint(&row.summary);
}
if row.status == ToolStatus::Failed {
// Deduplicate failures for the same tool name: keep only the most
// recent failure per tool. Fixes #1884 — stale failures from
// tools that have since succeeded no longer crowd the sidebar.
let fail_key = row.name.trim().to_ascii_lowercase();
if seen_failures.contains(&fail_key) || seen_tool_names_succeeded.contains(&fail_key) {
continue;
}
seen_failures.push(fail_key);
row.summary = failure_summary_with_hint(&row.summary);
}

Comment thread crates/tui/src/tui/sidebar.rs Outdated
Comment on lines 1234 to 1243
// When a tool succeeds, remove any prior failure for the same
// tool name. Prevents stale `gh issue create (failed)` from
// lingering after a successful retry. (#1884)
if row.status == ToolStatus::Success {
let normalized = row.name.trim().to_ascii_lowercase();
candidates.retain(|c| {
c.row.name.trim().to_ascii_lowercase() != normalized
|| c.row.status != ToolStatus::Failed
});
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This candidates.retain block is incorrect and should be removed. Because rows is processed most recent first, an older success is processed after a more recent failure. When that older success is processed, this block will incorrectly remove the more recent failure from candidates.

By removing this block and instead checking seen_tool_names_succeeded in the ToolStatus::Failed block, we correctly skip older failures when a more recent success exists, while preserving more recent failures when an older success exists.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View 4 additional findings in Devin Review.

Open in Devin Review

Comment on lines +1168 to +1172
let fail_key = row.name.trim().to_ascii_lowercase();
if seen_failures.contains(&fail_key) {
continue;
}
seen_failures.push(fail_key);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 seen_tool_names_succeeded is populated but never read, so failure suppression doesn't work when rows arrive newest-first

The variable seen_tool_names_succeeded is populated at lines 1228-1231 but is never read anywhere in the function. It was clearly intended to be checked in the failure-handling block (around line 1168-1170) to skip failures for tools that have already succeeded.

This matters because recent_tool_rows (line 836) passes rows in newest-first order via .rev(). When a success (newer) is processed before its corresponding older failure, the candidates.retain() at line 1239 is a no-op — the failure candidate hasn't been added yet, so there's nothing to remove. The missing seen_tool_names_succeeded check was supposed to handle exactly this case: when the loop later encounters the older failure, it should see that the tool already succeeded and skip it.

Without this check, the PR's stated goal (#1884 — clearing stale failures after a successful retry) does not work for the recent_tool_rows code path, which is the primary path for showing completed tool history.

Suggested change
let fail_key = row.name.trim().to_ascii_lowercase();
if seen_failures.contains(&fail_key) {
continue;
}
seen_failures.push(fail_key);
let fail_key = row.name.trim().to_ascii_lowercase();
if seen_failures.contains(&fail_key)
|| seen_tool_names_succeeded.contains(&fail_key)
{
continue;
}
seen_failures.push(fail_key);
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment thread crates/tui/src/tui/sidebar.rs Outdated
Comment on lines 1239 to 1243
candidates.retain(|c| {
c.row.name.trim().to_ascii_lowercase() != normalized
|| c.row.status != ToolStatus::Failed
});
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 visible_failure_count not decremented when candidates.retain removes failure candidates

When candidates.retain at line 1239 removes a failure candidate (because a later success for the same tool was encountered), visible_failure_count is not decremented. This inflates the count, causing subsequent legitimate failures to be incorrectly demoted to rank 3 even when fewer than MAX_VISIBLE_FAILURES rank-0 failures remain in candidates. This only affects the active_tool_rows path (oldest-first order) where retain can actually remove failures.

Suggested change
candidates.retain(|c| {
c.row.name.trim().to_ascii_lowercase() != normalized
|| c.row.status != ToolStatus::Failed
});
}
let normalized = row.name.trim().to_ascii_lowercase();
let before = candidates.len();
candidates.retain(|c| {
c.row.name.trim().to_ascii_lowercase() != normalized
|| c.row.status != ToolStatus::Failed
});
let removed = before - candidates.len();
visible_failure_count = visible_failure_count.saturating_sub(removed);
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment thread crates/tui/src/tui/sidebar.rs Outdated
Comment thread crates/tui/src/tui/sidebar.rs
Comment thread crates/tui/src/tui/sidebar.rs
Comment thread crates/tui/src/tui/sidebar.rs Outdated
@Hmbown
Copy link
Copy Markdown
Owner Author

Hmbown commented May 27, 2026

Independent review:

Scope is correct: "stale" is per-render-pass over RECENT_TOOL_SCAN_LIMIT history, not persistent — no cross-session leakage.

Real concerns:

  1. Tool-name-only matching is coarse. candidates.retain (line 1255-1265) removes a Failed candidate when any Success for the same row.name appears. For multi-invocation tools (e.g. shell with different args, or gh with different subcommands after trimming), a Success for invocation A can clear a still-relevant Failure for invocation B. sidebar_row_identity is used for Success dedup but not for this clear — inconsistent. Consider keying on identity, not just name.
  2. OldestFirst keeps oldest failure summary (Greptile flagged, still open) — repeated failures for same tool show stale error text. Minor.
  3. Tests miss the name-collision case above. The 3 added tests all use distinct tool names per failure.

No UI race (single synchronous pass, no clear-then-re-add visible to renderer). No conflict with #2256 v0.8.48 — that PR touches 60+ files but not sidebar.rs. Clean merge confirmed via git merge --no-commit --no-ff.

Net: ship-worthy fix for #1884; recommend follow-up to tighten the name-match heuristic before tools with shared names produce a regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TUI cockpit shows stale low-signal state and clipped recent-tool failures

2 participants