Description
Objective
Create a separate OpenTUI-based terminal dashboard app for AgentV that provides a keyboard-first local UI for browsing run history, live eval progress, and result summaries without expanding the core evaluation runtime.
Architecture Boundary
external-first
This should live as a separate app/package in the monorepo rather than inside the main CLI package runtime. The dashboard may consume existing AgentV artifacts and shared libraries, and the CLI may later provide a thin launcher entrypoint, but the TUI itself should remain isolated from core eval execution logic.
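One way to picture the thin launcher boundary is a CLI-side helper that only builds a child-process command and never imports anything from the TUI package. This is a minimal sketch under assumptions: the function name `buildDashboardLaunch`, the `LaunchSpec` shape, and the entry path used in the usage note are hypothetical and not part of AgentV today.

```typescript
// Hypothetical sketch: names and paths are illustrative assumptions,
// not AgentV's actual layout.

/** Describes the child process the CLI would start for the dashboard. */
export interface LaunchSpec {
  command: string;
  args: string[];
}

/**
 * Builds the command for an assumed `agentv dashboard` launcher.
 * The CLI knows only the path to the TUI entry file and forwards arguments;
 * it does not import OpenTUI or any module from the TUI package.
 */
export function buildDashboardLaunch(
  tuiEntry: string,
  forwardedArgs: readonly string[] = [],
): LaunchSpec {
  // `bun run <file>` executes the entry file with Bun, so Bun-specific
  // runtime assumptions stay inside the spawned process.
  return { command: "bun", args: ["run", tuiEntry, ...forwardedArgs] };
}
```

The CLI could hand the resulting `command` and `args` to a process spawner with inherited stdio, for example `buildDashboardLaunch("apps/tui/src/index.ts", ["--results", "./out"])`, where both the entry path and the `--results` flag are placeholders.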
Design Latitude
Engineer may choose the exact package/app location and launch model, for example apps/tui with an optional agentv dashboard launcher in apps/cli.
Prefer reusing existing AgentV result/history abstractions and output artifacts over inventing a new plugin system. If a shared dashboard data/query layer is needed for both web and TUI surfaces, keep that shared layer UI-agnostic.
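A UI-agnostic data layer could be as small as a pair of pure functions that turn result records into per-run summaries, which either a web or a TUI surface could render. The sketch below assumes results are available as JSON Lines text with the fields shown; the `EvalResult` shape, the field names, and the 0 to 1 score range are hypothetical and may not match AgentV's real artifact schema.

```typescript
// Hypothetical record shape; AgentV's real artifact schema may differ.
export interface EvalResult {
  runId: string;
  testId: string;
  score: number; // assumed to be in the range 0..1
  passed: boolean;
}

export interface RunSummary {
  runId: string;
  total: number;
  passed: number;
  failed: number;
  meanScore: number;
}

/** Parses JSON Lines text (one result object per line), skipping blank lines. */
export function parseResultsJsonl(text: string): EvalResult[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as EvalResult);
}

/** Groups results by run and computes the totals a run-list view would need. */
export function summarizeRuns(results: readonly EvalResult[]): RunSummary[] {
  const byRun = new Map<string, EvalResult[]>();
  for (const r of results) {
    const bucket = byRun.get(r.runId) ?? [];
    bucket.push(r);
    byRun.set(r.runId, bucket);
  }
  // Map preserves insertion order, so runs appear in first-seen order.
  return Array.from(byRun.entries()).map(([runId, rs]) => {
    const passed = rs.filter((r) => r.passed).length;
    const meanScore = rs.reduce((sum, r) => sum + r.score, 0) / rs.length;
    return { runId, total: rs.length, passed, failed: rs.length - passed, meanScore };
  });
}
```

Because neither function touches the terminal, the file system, or any rendering library, the same module could back a browser dashboard and the TUI without modification.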
Acceptance Signals
- A dedicated OpenTUI app/package exists in the monorepo and can run independently of the main CLI package internals.
- The TUI can read existing AgentV run artifacts or the same history storage used by dashboard/reporting features.
- The TUI provides at least these core views:
  - run list / history
  - active run or recent run summary
  - per-run detail view with pass/fail and score breakdown
- The UI is keyboard-first and usable entirely in the terminal.
- The implementation does not require introducing a general-purpose tokentop-style plugin runtime.
- Shared data-loading logic, if introduced, is reusable by other dashboard surfaces.
- Packaging/runtime assumptions specific to OpenTUI/Bun remain isolated from the main agentv CLI package.
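The keyboard-first navigation between the three core views can be modeled as a pure state reducer that is testable without a terminal or any OpenTUI code. This is only a sketch: the key bindings (arrows to move, Enter to open, Tab to toggle the summary, Escape to go back) and all type names are assumptions chosen for illustration, not decided behavior.

```typescript
// Hypothetical view and key names; the actual bindings are a design decision.
export type View = "runList" | "summary" | "detail";
export type Key = "up" | "down" | "enter" | "escape" | "tab";

export interface UiState {
  view: View;
  selectedIndex: number; // index of the highlighted run in the run list
  runCount: number; // number of runs currently loaded
}

/** Returns the next UI state for a key press; never mutates the input state. */
export function reduceKey(state: UiState, key: Key): UiState {
  const lastIndex = Math.max(0, state.runCount - 1);
  switch (key) {
    case "up":
      // Selection moves only in the run list and is clamped at the top.
      return state.view === "runList"
        ? { ...state, selectedIndex: Math.max(0, state.selectedIndex - 1) }
        : state;
    case "down":
      // Clamped at the last run so the cursor cannot leave the list.
      return state.view === "runList"
        ? { ...state, selectedIndex: Math.min(lastIndex, state.selectedIndex + 1) }
        : state;
    case "enter":
      // Open the per-run detail view only when there is a run to open.
      return state.view === "runList" && state.runCount > 0
        ? { ...state, view: "detail" }
        : state;
    case "tab":
      // Toggle between the run list and the summary view.
      if (state.view === "runList") return { ...state, view: "summary" };
      if (state.view === "summary") return { ...state, view: "runList" };
      return state;
    case "escape":
      // Go back to the run list from any other view.
      return state.view === "runList" ? state : { ...state, view: "runList" };
    default:
      return state;
  }
}
```

A rendering layer would map raw key events from the terminal library onto `Key` values and redraw from the returned state, keeping view logic separate from OpenTUI-specific code.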
Non-Goals
- Replacing or merging with the web dashboard work in feat: self-hosted dashboard — historical trends, dataset management, YAML editor #563.
- Designing a general third-party plugin architecture for the dashboard.
- Moving core evaluation logic into the TUI package.
- Requiring the main CLI package to depend directly on OpenTUI internals.
- Full parity with the web dashboard on the first pass.
Related
- feat: self-hosted dashboard — historical trends, dataset management, YAML editor #563: the web/local browser dashboard
- Interactive eval TUI should list available models by provider #520
- Research: tokentop OpenTUI and plugin/TUI architecture comparison in agentevals-research