feat: redesign discover & gain reports with command grouping #550

Open
bz00qa wants to merge 8 commits into rtk-ai:develop from bz00qa:feat/discover-report-redesign

@bz00qa bz00qa commented Mar 12, 2026

Summary

Discover Report Redesign

  • Colorized discover report with TTY-aware table formatting, styled commands, and ANSI-safe column alignment (pad_right/pad_left helpers)
  • Top token consumers section showing which commands use the most tokens, with smart base_cmd grouping (aggregates by first 3 words instead of exact command string)
  • Pattern detection for common usage patterns: git context (status+diff+log), watch loops (repeated commands), and dedup opportunities (large outputs)
  • Smart grouping: consumer_base() handles special patterns like pnpm --filter, python -m, keeping semantically related commands together
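
The ANSI-safe alignment mentioned above can be illustrated with a minimal sketch. The PR's actual pad_right/pad_left are not shown in this description, so the signatures and the visible_width helper below are assumptions: the idea is only that padding must be computed from the visible width of a string, not from its byte or character length, once color codes are embedded.

```rust
/// Count the characters that occupy a terminal column, skipping ANSI CSI
/// escape sequences such as "\x1b[32m". This is a simplification: it counts
/// every other `char` as one column, so wide glyphs are not handled.
fn visible_width(s: &str) -> usize {
    let mut width = 0;
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\x1b' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // A CSI sequence ends at the first character in '@'..='~'.
            for e in chars.by_ref() {
                if ('@'..='~').contains(&e) {
                    break;
                }
            }
        } else {
            width += 1;
        }
    }
    width
}

/// Left-align `s` in a column of `width` visible characters (hypothetical signature).
fn pad_right(s: &str, width: usize) -> String {
    let pad = width.saturating_sub(visible_width(s));
    format!("{}{}", s, " ".repeat(pad))
}

/// Right-align `s` in a column of `width` visible characters (hypothetical signature).
fn pad_left(s: &str, width: usize) -> String {
    let pad = width.saturating_sub(visible_width(s));
    format!("{}{}", " ".repeat(pad), s)
}
```

With this approach a green "git" and a plain "git" both take three columns, so table rows stay aligned whether or not color is enabled.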

Gain Report Enhancements

  • Command Coverage section showing used/available commands per category (9 categories) with unused command hints
  • Category column added to "By Command (top 10)" table
  • Routing Breakdown section (dedicated filters vs proxy vs other)
  • JSON export includes coverage data (ExportCoverage, ExportCategoryStats)
  • COMMAND_REGISTRY constant mapping all RTK commands to categories
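
As a rough illustration of how a registry can drive the coverage section, here is a sketch under stated assumptions. The entries and category names below are placeholders (the real COMMAND_REGISTRY and its 9 categories are not listed in this description), and coverage() is a hypothetical helper, not the PR's print_command_coverage().

```rust
use std::collections::{BTreeMap, HashSet};

// Hypothetical subset: commands and category names are illustrative only.
const COMMAND_REGISTRY: &[(&str, &str)] = &[
    ("git", "Version Control"),
    ("gh", "Version Control"),
    ("cargo", "Build"),
    ("grep", "Files"),
    ("ls", "Files"),
];

/// Look up the category of a command, if it is registered.
fn lookup_category(cmd: &str) -> Option<&'static str> {
    COMMAND_REGISTRY
        .iter()
        .find(|(name, _)| *name == cmd)
        .map(|(_, cat)| *cat)
}

/// For each category, return (used, available) command counts.
fn coverage(used: &HashSet<&str>) -> BTreeMap<&'static str, (usize, usize)> {
    let mut out = BTreeMap::new();
    for (name, cat) in COMMAND_REGISTRY {
        let entry = out.entry(*cat).or_insert((0usize, 0usize));
        entry.1 += 1; // every registry entry counts as available
        if used.contains(*name) {
            entry.0 += 1; // counted as used only if it appears in history
        }
    }
    out
}
```

A BTreeMap keeps the categories in a stable order, which is convenient for both the text report and the JSON export.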

Tracking DB Improvements

  • base_cmd column with migration + backfill for command aggregation
  • new_in_memory() for test isolation
  • effective_idx logic for correct output attribution in command chains
  • get_by_command_all() for unlimited command grouping
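
To make the base_cmd idea concrete, the following is a minimal sketch of one way such a grouping key could be derived. It is not the PR's extract_base_cmd(): the max_words parameter and the strip_numeric_suffix helper are assumptions, and the suffix handling follows the behaviour described in a later commit message of this PR (trailing _<digits> removed so test-generated names collapse into one group).

```rust
/// Strip one trailing `_<digits>` suffix from a word,
/// e.g. "cmd1_test_19952" becomes "cmd1_test".
fn strip_numeric_suffix(word: &str) -> &str {
    match word.rsplit_once('_') {
        Some((head, tail))
            if !head.is_empty()
                && !tail.is_empty()
                && tail.chars().all(|c| c.is_ascii_digit()) =>
        {
            head
        }
        _ => word,
    }
}

/// Build a grouping key from the first `max_words` whitespace-separated
/// words of a command line (hypothetical signature).
fn extract_base_cmd(command: &str, max_words: usize) -> String {
    command
        .split_whitespace()
        .take(max_words)
        .map(strip_numeric_suffix)
        .collect::<Vec<_>>()
        .join(" ")
}
```

Storing this key in its own column lets the database aggregate with a plain GROUP BY instead of re-parsing every command string at report time.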

Changes

| File | Description |
| --- | --- |
| src/discover/report.rs | New PatternOpportunity and TokenConsumer types, colorized output helpers, redesigned format_text() with pattern/consumer sections, 7 new tests |
| src/discover/mod.rs | Consumer map tracking, detect_patterns(), normalize_cmd_base(), consumer_base(), effective_idx logic |
| src/tracking.rs | base_cmd column migration, extract_base_cmd(), new_in_memory(), get_by_command_all(), refactored tests to in-memory DB |
| src/gain.rs | COMMAND_REGISTRY, category column, print_command_coverage(), print_routing_breakdown(), coverage JSON export, 6 new tests |

Test plan

  • All 900 tests pass (1 pre-existing binlog CRLF failure on Windows)
  • cargo fmt --all --check passes
  • New tests: table alignment, pad helpers, in-memory DB, base_cmd extraction, registry validation, category lookup, coverage display
  • Manual: rtk discover and rtk gain on a project with Claude Code history

🤖 Generated with Claude Code

@bz00qa bz00qa changed the title feat: redesign discover report with command grouping feat: redesign discover & gain reports with command grouping Mar 12, 2026
@aeppling aeppling added the wrong-base label (PR targets master instead of develop) Mar 13, 2026
@aeppling
Contributor

Hello,

Thanks for contributing.

Please follow the CONTRIBUTING.md guidelines: your branch should be created from develop and merged back into develop, not master.

@aeppling
Contributor

aeppling commented Mar 13, 2026

Full review v1

Contribution Guidelines

  1. Wrong target

The feature branch should be created from develop, and the PR should target develop.

  2. Missing Documentation Updates

CONTRIBUTING.md requires documentation updates for changes of this scope:

  • CHANGELOG.md — not updated (required for new features)
  • README.md — not updated (new gain sections, coverage reporting)
  • ARCHITECTURE.md — not updated (new base_cmd column, pattern detection system, command registry)

OUT OF SCOPE

The PR description says "redesign discover & gain reports," but the actual changes include several orthogonal behavioral changes that should be separate PRs:

  1. Verbose Flag Stripping (registry.rs)

strip_verbose_flags() silently removes -v, -vv, --verbose, --debug from rewritten commands. This is a destructive
behavioral change to the rewrite hook affecting ALL users.
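
To show why blanket flag stripping is risky, here is a hypothetical re-creation of the behaviour described above. It is a sketch only: the PR's real strip_verbose_flags() is not quoted in this review and may be more selective.

```rust
/// Hypothetical version of the described behaviour: drop -v, -vv,
/// --verbose and --debug from a command line, regardless of the command.
fn strip_verbose_flags(cmd: &str) -> String {
    cmd.split_whitespace()
        .filter(|tok| !matches!(*tok, "-v" | "-vv" | "--verbose" | "--debug"))
        .collect::<Vec<_>>()
        .join(" ")
}
```

For `cargo test --verbose` this only reduces output, but for `grep -v TODO src/main.rs` the `-v` flag means "invert match", so a command-agnostic strip like this one would change the result of the command rather than its verbosity.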

  2. Proxy Routing for Unsupported Commands (registry.rs)

The PR changes unsupported commands from returning None (pass-through unchanged) to Some("rtk proxy -f "):

// Before: htop -> None (executed as-is)
// After: htop -> Some("rtk proxy -f htop")

This is a significant behavioral change: every single unrecognized command now gets wrapped in rtk proxy -f. This
affects the rewrite hook for all users and should be its own PR with benchmarking (adds overhead to every unrecognized
command).
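
The difference between the two behaviours can be sketched as follows. The function names, the SUPPORTED list, and the `rtk <cmd>` rewrite form are assumptions made for illustration; only the None versus Some("rtk proxy -f ...") contrast comes from the diff discussed above.

```rust
// Hypothetical subset of commands that have a dedicated rewrite.
const SUPPORTED: &[&str] = &["git", "cargo"];

/// Before: unsupported commands are left untouched (None means pass-through).
fn rewrite_before(cmd: &str) -> Option<String> {
    let first = cmd.split_whitespace().next()?;
    SUPPORTED.contains(&first).then(|| format!("rtk {cmd}"))
}

/// After: unsupported commands are wrapped in the proxy instead.
fn rewrite_after(cmd: &str) -> Option<String> {
    let first = cmd.split_whitespace().next()?;
    if SUPPORTED.contains(&first) {
        Some(format!("rtk {cmd}"))
    } else {
        Some(format!("rtk proxy -f {cmd}"))
    }
}
```

In the first version the hook costs nothing for commands it does not know; in the second, every unknown command goes through an extra process, which is why benchmarking is requested.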

  3. Massively Expanded Subcommand Recognition (rules.rs) — Duplicates PR #549 (feat(discover): expand subcommand coverage and strip verbose flags)

Code Quality

  1. detect_patterns() References Non-Existent Commands

suggestion: "rtk context".to_string(), // Doesn't exist
suggestion: format!("rtk watch {}", cmd), // Doesn't exist
suggestion: format!("rtk dedup {}", cmd), // Doesn't exist

These commands are proposed in PR #553 (not merged). The discover report will suggest commands users can't actually
run. This creates user confusion. At minimum, the suggestions should note these are planned/future features, or this
code should depend on PR #553 merging first.

  2. COMMAND_REGISTRY References Non-Existent Commands (gain.rs)

("context", "Meta Commands"), // Doesn't exist
("dedup", "Meta Commands"), // Doesn't exist
("watch", "Meta Commands"), // Doesn't exist

Same issue — the gain coverage report will show these as "available" commands.
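
One possible guard against this kind of drift is a check that compares the registry with the commands the binary really dispatches. The sketch below is an assumption-heavy illustration: IMPLEMENTED is a hypothetical list (in practice it could be derived from the Commands enum), and the "Version Control" category name is a placeholder.

```rust
const COMMAND_REGISTRY: &[(&str, &str)] = &[
    ("git", "Version Control"), // placeholder category name
    ("context", "Meta Commands"),
    ("dedup", "Meta Commands"),
    ("watch", "Meta Commands"),
];

// Hypothetical: names of subcommands that actually exist in the binary.
const IMPLEMENTED: &[&str] = &["git"];

/// Registry entries that have no matching implemented command.
fn unimplemented_entries() -> Vec<&'static str> {
    COMMAND_REGISTRY
        .iter()
        .map(|(name, _)| *name)
        .filter(|name| !IMPLEMENTED.contains(name))
        .collect()
}
```

A unit test asserting that this list is empty would fail for the three entries flagged above, catching the problem before the coverage report advertises them.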

Missing Tests

  1. No Tests for consumer_base() — Critical

This function handles complex edge cases (python -m, pnpm --filter, arg-only commands, flag commands) but has zero
unit tests. Given the complexity, this needs:

  • Basic 2-word grouping test
  • python -m module test
  • pnpm --filter skipping test
  • ARG_COMMANDS grouping test
  • FLAG_COMMANDS with/without flags test
  • Empty input test
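
As a starting point, the cases above could look roughly like this. The consumer_base() below is a hypothetical stand-in written only to make the test runnable: the real function is not quoted in this review, the 2-word fallback follows the "basic 2-word grouping" case, and the FLAG_COMMANDS case is omitted because its expected behaviour is not described here.

```rust
// Assumed list, taken from the commands named in this PR (cat, find, grep, ls).
const ARG_COMMANDS: &[&str] = &["cat", "find", "grep", "ls"];

/// Hypothetical grouping key for the "top token consumers" table.
fn consumer_base(cmd: &str) -> String {
    let words: Vec<&str> = cmd.split_whitespace().collect();
    match words.as_slice() {
        [] => String::new(),
        // Arg-only commands: drop paths and patterns, keep the command name.
        [first, ..] if ARG_COMMANDS.contains(first) => first.to_string(),
        // `python -m module ...`: the module is the meaningful part.
        [py, "-m", module, ..] if py.starts_with("python") => format!("{py} -m {module}"),
        // `pnpm --filter <pkg> <script> ...`: skip the filter and its value.
        ["pnpm", "--filter", _pkg, script, ..] => format!("pnpm {script}"),
        // Default: group by the first two words.
        [first, second, ..] => format!("{first} {second}"),
        [only] => only.to_string(),
    }
}

#[test]
fn consumer_base_cases() {
    assert_eq!(consumer_base("git status --short"), "git status");
    assert_eq!(consumer_base("python -m pytest tests/"), "python -m pytest");
    assert_eq!(consumer_base("pnpm --filter web build"), "pnpm build");
    assert_eq!(consumer_base("cat src/main.rs"), "cat");
    assert_eq!(consumer_base(""), "");
}
```

Each assertion maps to one bullet in the list, so a regression in any special case points directly at the rule that broke.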

  2. No Tests for normalize_cmd_base() — Critical

Another complex function with zero tests. Needs:

  • Skip prefix test
  • RTK prefix stripping test
  • Pure env assignment test
  • Normal 2-word extraction test

  3. No Tests for detect_patterns() — Major

The entire pattern detection system (context/watch/dedup) has no tests. This is a core feature of the PR with complex
logic (windowed analysis, threshold-based detection, cross-session aggregation).
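
To show what a testable core of windowed, threshold-based detection might look like, here is a small sketch for the watch-loop case only. It does not reproduce detect_patterns(): the function name, the window and threshold parameters, and the flat list of commands as input are all assumptions.

```rust
use std::collections::HashMap;

/// Return commands that occur at least `threshold` times inside any run of
/// `window` consecutive commands, in order of first detection.
fn detect_watch_loops(cmds: &[&str], window: usize, threshold: usize) -> Vec<String> {
    let mut flagged: Vec<String> = Vec::new();
    // Shrink the window for short histories; `windows(0)` would panic.
    let w = window.min(cmds.len());
    if w == 0 || threshold == 0 {
        return flagged;
    }
    for win in cmds.windows(w) {
        // Count occurrences of each command inside this window.
        let mut counts: HashMap<&str, usize> = HashMap::new();
        for cmd in win {
            *counts.entry(*cmd).or_insert(0) += 1;
        }
        // Walk the window again so results follow command order, not hash order.
        for cmd in win {
            if counts[*cmd] >= threshold && !flagged.iter().any(|f| f == cmd) {
                flagged.push(cmd.to_string());
            }
        }
    }
    flagged
}
```

Keeping the detection a pure function over a slice makes the thresholds easy to cover with table-style tests, independent of session parsing.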

  4. No Tests for effective_idx Logic — Major

The rposition() logic for attributing output to the last non-ignored command in a chain is untested.
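
The attribution rule can be sketched in a few lines. The Classification enum and the effective_idx() signature are assumptions for illustration; only the use of rposition() to find the last non-ignored command comes from the PR.

```rust
#[derive(Debug, PartialEq)]
enum Classification {
    Supported,
    Unsupported,
    Ignored,
}

/// Index of the last command in a chain that is not ignored, i.e. the
/// command the chain's output should be attributed to.
fn effective_idx(chain: &[Classification]) -> Option<usize> {
    // rposition searches from the end but returns an index from the front.
    chain.iter().rposition(|c| *c != Classification::Ignored)
}
```

For a chain such as `cd repo && git log && echo done`, assuming `cd` and `echo` are classified as ignored, the output would be attributed to index 1 (`git log`). Edge cases worth testing include an empty chain and a chain made only of ignored commands, both of which yield None here.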

Duplicated code

  1. Bun/Deno/Bunx Rules (rules.rs) — Duplicates PR #548

PR #550 adds 3 new regex patterns and 3 full RtkRule entries for bun, bunx, and deno in rules.rs:

// PR #550 adds these PATTERNS (rules.rs lines 1763-1765 of diff):
r"^bun\s+(install|add|remove|run|build|test|dev|start|pm|create|upgrade|link|unlink|patch|exec|repl|init|outdated|publish|x)",
r"^bunx\s+",
r"^deno\s+(run|test|check|lint|fmt|compile|install|task|bench|doc|eval|info|cache|upgrade|uninstall|publish|add|remove|init)",

// PR #550 adds these RULES (rules.rs lines 1959-2021 of diff):
RtkRule { rtk_cmd: "rtk bun", savings_pct: 75.0, subcmd_status: 13 entries, ... }
RtkRule { rtk_cmd: "rtk bunx", savings_pct: 70.0, ... }
RtkRule { rtk_cmd: "rtk deno", savings_pct: 70.0, subcmd_status: 12 entries, ... }

PR #548 ("feat: add Bun and Deno runtime support") by the same author adds:

  • src/bun_cmd.rs (261 lines) — actual filter implementation with 6 subcommands
  • src/deno_cmd.rs (254 lines) — actual filter implementation with 5 subcommands
  • src/main.rs — command routing (Commands enum, dispatch)
  • src/discover/rules.rs — the same 3 patterns + 3 rules

@aeppling aeppling self-assigned this Mar 13, 2026
bz00qa and others added 7 commits March 13, 2026 17:33
…ction

- Add colorized TTY-aware discover report with table formatting
- Add top token consumers section with smart base_cmd grouping
- Add pattern detection (git context, watch loops, dedup opportunities)
- Add base_cmd column to tracking DB with migration + backfill
- Add effective_idx logic for correct output_len attribution in chains
- Add consumer_base() for smart grouping (handles pnpm --filter, python -m)
- Add pad_right/pad_left helpers for ANSI-safe column alignment
- Add in-memory SQLite (new_in_memory) for test isolation
- Refactor all tracking tests to use in-memory DB

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: bz00qa <bz00qa.hub@gmail.com>

- Add COMMAND_REGISTRY mapping commands to categories (9 categories)
- Add Category column to "By Command" table
- Add "Command Coverage" section showing used/available per category
- Add "Routing Breakdown" section (dedicated vs proxy vs other)
- Add coverage data to JSON export (ExportCoverage, ExportCategoryStats)
- Add lookup_category(), print_command_coverage(), print_routing_breakdown()
- Add 6 tests (registry validation, category lookup, coverage display)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: bz00qa <bz00qa.hub@gmail.com>

extract_base_cmd now strips trailing _\d+ numeric suffixes from command
words, so test-generated entries like cmd1_test_19952 and cmd1_test_24604
are grouped as "rtk cmd1_test" in analytics. Includes migration to
re-normalize existing base_cmd values.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: bz00qa <bz00qa.hub@gmail.com>

Dynamic column widths for Count/Total/Avg in TOP TOKEN CONSUMERS table.
consumer_base() no longer includes file paths for arg-only commands
(cat, find, grep, ls).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: bz00qa <bz00qa.hub@gmail.com>

classify_command() regexes require args (e.g. "find\s+"), so bare
command names like "find" or "cat" from consumer_base() didn't match.
Now also tries classification with a dummy arg appended.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: bz00qa <bz00qa.hub@gmail.com>

- Add cargo run/init/new/fix/audit to cargo regex
- Add gh auth/workflow/gist/status/browse + more to gh regex
- Handle git -c key=val before subcommand
- cat/head/tail with redirects now Ignored instead of Unsupported

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: bz00qa <bz00qa.hub@gmail.com>

Replace boolean has_rtk_filter with RtkHandling enum to distinguish
between commands with dedicated filters (Yes), passthrough support
(Pass), and no RTK handling (No) in the consumers table.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: bz00qa <bz00qa.hub@gmail.com>
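
The boolean-to-enum change described in the commit above can be pictured with a short sketch. Only the enum name and its three variants (Yes, Pass, No) come from the commit message; the from_flags() constructor and label() method are hypothetical helpers added for illustration.

```rust
/// How RTK handles a command in the consumers table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RtkHandling {
    Yes,  // dedicated filter
    Pass, // passthrough support
    No,   // no RTK handling
}

impl RtkHandling {
    /// Hypothetical constructor: a dedicated filter wins over passthrough.
    fn from_flags(has_dedicated_filter: bool, has_passthrough: bool) -> Self {
        if has_dedicated_filter {
            Self::Yes
        } else if has_passthrough {
            Self::Pass
        } else {
            Self::No
        }
    }

    /// Hypothetical short label for a table cell.
    fn label(self) -> &'static str {
        match self {
            Self::Yes => "yes",
            Self::Pass => "pass",
            Self::No => "no",
        }
    }
}
```

Compared with a single boolean, the enum forces every match site to decide what to do with the passthrough case, so a new state cannot be silently treated as "no filter".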
@bz00qa bz00qa force-pushed the feat/discover-report-redesign branch from 0fab6fc to 48fd7b0 Compare March 13, 2026 16:34
@bz00qa bz00qa changed the base branch from master to develop March 13, 2026 16:34
…docs and tests

Remove scope creep from PR rtk-ai#550:
- Restore rules.rs/registry.rs to develop baseline, keep only in-scope changes
- Remove strip_verbose_flags(), proxy -f routing, expanded subcommands
- Remove bun/deno/bunx from COMMAND_REGISTRY (unmerged PR rtk-ai#548)
- Remove context/watch/dedup from COMMAND_REGISTRY (non-existent PR rtk-ai#553)
- Remove detect_patterns() and normalize_cmd_base() (deferred)
- Remove PatternOpportunity from report

Add missing tests: consumer_base(), effective_idx
Add docs: CHANGELOG.md, README.md gain section, ARCHITECTURE.md discover module
Fix clippy warnings in new test code (vec![] → array, single_match)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@bz00qa
Author

bz00qa commented Mar 13, 2026

Thanks for the detailed feedback. I have tried to address every point mentioned and updated the PR accordingly.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@aeppling
Contributor

Hey

We are cleaning up the codebase and improving the project structure for better onboarding. As part of this effort, PR #826 reorganizes src/ from a flat layout into subfolders.

No logic changes — only file moves and import path updates.

What you need to do

Rebase your branch on develop when you receive this comment:

git fetch origin && git rebase origin/develop

Git detects renames automatically. If you get import conflicts, update the paths:

use crate::git;        // now: use crate::cmds::git::git;
use crate::tracking;   // now: use crate::core::tracking;
use crate::config;     // now: use crate::core::config;
use crate::init;       // now: use crate::hooks::init;
use crate::gain;       // now: use crate::analytics::gain;

Need help rebasing? Tag @aeppling


Labels

awaiting-changes, wrong-base (PR targets master instead of develop)


4 participants