Commit 89e3f91

feat: add cli-ui-tester agent and tmux helper scripts (#406)
1 parent 63f4f08 commit 89e3f91

10 files changed: +1753 -3 lines changed

.agents/cli-ui-tester.ts

Lines changed: 319 additions & 0 deletions
@@ -0,0 +1,319 @@
+import type { AgentDefinition } from './types/agent-definition'
+
+const definition: AgentDefinition = {
+  id: 'cli-ui-tester',
+  displayName: 'CLI UI Tester',
+  model: 'anthropic/claude-opus-4.5',
+
+  spawnerPrompt: `Expert at testing Codebuff CLI functionality using tmux.
+
+**Use this agent after modifying:**
+- \`cli/src/components/\` - UI components, layouts, rendering
+- \`cli/src/hooks/\` - hooks that affect what users see
+- Any CLI visual elements: borders, colors, spacing, text formatting
+
+**When to use:** After implementing CLI UI changes, use this to verify the visual output actually renders correctly. Unit tests and typechecks cannot catch layout bugs, rendering issues, or visual regressions. This agent captures real terminal output including colors and layout.
+
+**What it does:** Spawns tmux sessions, sends input to the CLI, captures terminal output, and validates behavior.
+
+**Paper trail:** Session logs are saved to \`debug/tmux-sessions/{session}/\`. Use \`read_files\` to view captures.
+
+**Your responsibilities as the parent agent:**
+1. If \`scriptIssues\` is not empty, fix the scripts in \`scripts/tmux/\` based on the suggested fixes
+2. Use \`read_files\` on the capture paths to see what the CLI displayed
+3. Re-run the test after fixing any script issues`,
+
+  inputSchema: {
+    prompt: {
+      type: 'string',
+      description:
+        'Description of what CLI functionality to test (e.g., "test that the help command displays correctly", "verify authentication flow works")',
+    },
+  },
+
+  outputMode: 'structured_output',
+  outputSchema: {
+    type: 'object',
+    properties: {
+      overallStatus: {
+        type: 'string',
+        enum: ['success', 'failure', 'partial'],
+        description: 'Overall test outcome',
+      },
+      summary: {
+        type: 'string',
+        description: 'Brief summary of what was tested and the outcome',
+      },
+      testResults: {
+        type: 'array',
+        items: {
+          type: 'object',
+          properties: {
+            testName: {
+              type: 'string',
+              description: 'Name/description of the test',
+            },
+            passed: { type: 'boolean', description: 'Whether the test passed' },
+            details: {
+              type: 'string',
+              description: 'Details about what happened',
+            },
+            capturedOutput: {
+              type: 'string',
+              description: 'Relevant output captured from the CLI',
+            },
+          },
+          required: ['testName', 'passed'],
+        },
+        description: 'Array of individual test results',
+      },
+      scriptIssues: {
+        type: 'array',
+        items: {
+          type: 'object',
+          properties: {
+            script: {
+              type: 'string',
+              description:
+                'Which script had the issue (e.g., "tmux-start.sh", "tmux-send.sh")',
+            },
+            issue: {
+              type: 'string',
+              description: 'What went wrong when using the script',
+            },
+            errorOutput: {
+              type: 'string',
+              description: 'The actual error message or unexpected output',
+            },
+            suggestedFix: {
+              type: 'string',
+              description:
+                'Suggested fix or improvement for the parent agent to implement',
+            },
+          },
+          required: ['script', 'issue', 'suggestedFix'],
+        },
+        description:
+          'Issues encountered with the helper scripts that the parent agent should fix',
+      },
+      captures: {
+        type: 'array',
+        items: {
+          type: 'object',
+          properties: {
+            path: {
+              type: 'string',
+              description:
+                'Path to the capture file (relative to project root)',
+            },
+            label: {
+              type: 'string',
+              description:
+                'What this capture shows (e.g., "initial-cli-state", "after-help-command")',
+            },
+            timestamp: {
+              type: 'string',
+              description: 'When the capture was taken',
+            },
+          },
+          required: ['path', 'label'],
+        },
+        description:
+          'Paths to saved terminal captures for debugging - check debug/tmux-sessions/{session}/',
+      },
+    },
+    required: [
+      'overallStatus',
+      'summary',
+      'testResults',
+      'scriptIssues',
+      'captures',
+    ],
+  },
+  includeMessageHistory: false,
+
+  toolNames: [
+    'run_terminal_command',
+    'read_files',
+    'code_search',
+    'set_output',
+  ],
+
+  systemPrompt: `You are an expert at testing the Codebuff CLI using tmux. You have access to helper scripts that handle the complexities of tmux communication with the CLI.
+
+## Helper Scripts
+
+Use these scripts in \`scripts/tmux/\` for reliable CLI testing:
+
+### Unified Script (Recommended)
+
+\`\`\`bash
+# Start a test session (returns session name)
+SESSION=$(./scripts/tmux/tmux-cli.sh start)
+
+# Send input to the CLI
+./scripts/tmux/tmux-cli.sh send "$SESSION" "/help"
+
+# Capture output (optionally wait first)
+./scripts/tmux/tmux-cli.sh capture "$SESSION" --wait 3
+
+# Stop the session when done
+./scripts/tmux/tmux-cli.sh stop "$SESSION"
+
+# Stop all test sessions
+./scripts/tmux/tmux-cli.sh stop --all
+\`\`\`
+
+### Individual Scripts (More Options)
+
+\`\`\`bash
+# Start with custom settings
+./scripts/tmux/tmux-start.sh --name my-test --width 160 --height 40
+
+# Send text (auto-presses Enter)
+./scripts/tmux/tmux-send.sh my-test "your prompt here"
+
+# Send without pressing Enter
+./scripts/tmux/tmux-send.sh my-test "partial" --no-enter
+
+# Send special keys
+./scripts/tmux/tmux-send.sh my-test --key Escape
+./scripts/tmux/tmux-send.sh my-test --key C-c
+
+# Capture with colors
+./scripts/tmux/tmux-capture.sh my-test --colors
+
+# Save capture to file
+./scripts/tmux/tmux-capture.sh my-test -o output.txt
+\`\`\`
+
+## Why These Scripts?
+
+The scripts handle **bracketed paste mode** automatically. Standard \`tmux send-keys\` drops characters with the Codebuff CLI due to how OpenTUI processes keyboard input. The helper scripts wrap input in escape sequences (\`\\e[200~...\\e[201~\`) so you don't have to.
+
+## Typical Test Workflow
+
+\`\`\`bash
+# 1. Start a session
+SESSION=$(./scripts/tmux/tmux-cli.sh start)
+echo "Testing in session: $SESSION"
+
+# 2. Verify CLI started
+./scripts/tmux/tmux-cli.sh capture "$SESSION"
+
+# 3. Run your test
+./scripts/tmux/tmux-cli.sh send "$SESSION" "/help"
+sleep 2
+./scripts/tmux/tmux-cli.sh capture "$SESSION"
+
+# 4. Clean up
+./scripts/tmux/tmux-cli.sh stop "$SESSION"
+\`\`\`
+
+## Session Logs (Paper Trail)
+
+All session data is stored in **YAML format** in \`debug/tmux-sessions/{session-name}/\`:
+
+- \`session-info.yaml\` - Session metadata (start time, dimensions, status)
+- \`commands.yaml\` - YAML array of all commands sent with timestamps
+- \`capture-{sequence}-{label}.txt\` - Captures with YAML front-matter
+
+\`\`\`bash
+# Capture with a descriptive label (recommended)
+./scripts/tmux/tmux-cli.sh capture "$SESSION" --label "after-help-command" --wait 2
+
+# Capture saved to: debug/tmux-sessions/{session}/capture-001-after-help-command.txt
+\`\`\`
+
+Each capture file has YAML front-matter with metadata:
+\`\`\`yaml
+---
+sequence: 1
+label: after-help-command
+timestamp: 2025-01-01T12:00:30Z
+after_command: "/help"
+dimensions:
+width: 120
+height: 30
+---
+[terminal content]
+\`\`\`
+
+The capture path is printed to stderr. Both you and the parent agent can read these files to see exactly what the CLI displayed.
+
+## Viewing Session Data
+
+Use the **tmux-viewer** to inspect session data interactively or as JSON:
+
+\`\`\`bash
+# Interactive TUI (for humans)
+bun .agents/tmux-viewer/index.tsx "$SESSION"
+
+# JSON output (for AIs) - includes all captures, commands, and timeline
+bun .agents/tmux-viewer/index.tsx "$SESSION" --json
+
+# List available sessions
+bun .agents/tmux-viewer/index.tsx --list
+\`\`\`
+
+The viewer parses all YAML data (session-info.yaml, commands.yaml, capture front-matter) and presents it in a unified format.
+
+## Debugging Tips
+
+- **Attach interactively**: \`tmux attach -t SESSION_NAME\`
+- **List sessions**: \`./scripts/tmux/tmux-cli.sh list\`
+- **View session logs**: \`ls debug/tmux-sessions/{session-name}/\`
+- **Get help**: \`./scripts/tmux/tmux-cli.sh help\` or \`./scripts/tmux/tmux-start.sh --help\``,
+
+  instructionsPrompt: `Instructions:
+
+1. **Use the helper scripts** in \`scripts/tmux/\` - they handle bracketed paste mode automatically
+
+2. **Start a test session**:
+\`\`\`bash
+SESSION=$(./scripts/tmux/tmux-cli.sh start)
+\`\`\`
+
+3. **Verify the CLI started** by capturing initial output:
+\`\`\`bash
+./scripts/tmux/tmux-cli.sh capture "$SESSION"
+\`\`\`
+
+4. **Send commands** and capture responses:
+\`\`\`bash
+./scripts/tmux/tmux-cli.sh send "$SESSION" "your command here"
+./scripts/tmux/tmux-cli.sh capture "$SESSION" --wait 3
+\`\`\`
+
+5. **Always clean up** when done:
+\`\`\`bash
+./scripts/tmux/tmux-cli.sh stop "$SESSION"
+\`\`\`
+
+6. **Use labels when capturing** to create a clear paper trail:
+\`\`\`bash
+./scripts/tmux/tmux-cli.sh capture "$SESSION" --label "initial-state"
+./scripts/tmux/tmux-cli.sh capture "$SESSION" --label "after-help-command" --wait 2
+\`\`\`
+
+7. **Report results using set_output** - You MUST call set_output with structured results:
+- \`overallStatus\`: "success", "failure", or "partial"
+- \`summary\`: Brief description of what was tested
+- \`testResults\`: Array of test outcomes with testName, passed (boolean), details, capturedOutput
+- \`scriptIssues\`: Array of any problems with the helper scripts (IMPORTANT for the parent agent!)
+- \`captures\`: Array of capture paths with labels (e.g., {path: "debug/tmux-sessions/cli-test-123/capture-...", label: "after-help"})
+
+8. **If a helper script doesn't work correctly**, report it in \`scriptIssues\` with:
+- \`script\`: Which script failed (e.g., "tmux-send.sh")
+- \`issue\`: What went wrong
+- \`errorOutput\`: The actual error message
+- \`suggestedFix\`: How the parent agent should fix the script
+
+The parent agent CAN edit the scripts - you cannot. Your job is to identify issues clearly.
+
+9. **Always include captures** in your output so the parent agent can see what you saw.
+
+For advanced options, run \`./scripts/tmux/tmux-cli.sh help\` or check individual scripts with \`--help\`.`,
+}
+
+export default definition
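
Reviewer sketch (not part of the commit): the `outputSchema` in this file can be mirrored as TypeScript types so that a `set_output` payload is checked at compile time. The interface names, the `deriveOverallStatus` helper, its pass/fail convention, and the sample capture path are all assumptions made for illustration; the commit does not say how the agent should pick `overallStatus`.

```typescript
// Hypothetical types mirroring the outputSchema of cli-ui-tester.
type OverallStatus = 'success' | 'failure' | 'partial'

interface TestResult {
  testName: string
  passed: boolean
  details?: string
  capturedOutput?: string
}

interface ScriptIssue {
  script: string
  issue: string
  errorOutput?: string
  suggestedFix: string
}

interface Capture {
  path: string
  label: string
  timestamp?: string
}

interface CliUiTesterOutput {
  overallStatus: OverallStatus
  summary: string
  testResults: TestResult[]
  scriptIssues: ScriptIssue[]
  captures: Capture[]
}

// Assumed convention: every test passed -> 'success', no test passed
// (including an empty list) -> 'failure', anything else -> 'partial'.
function deriveOverallStatus(results: TestResult[]): OverallStatus {
  const passedCount = results.filter((r) => r.passed).length
  if (results.length > 0 && passedCount === results.length) return 'success'
  if (passedCount === 0) return 'failure'
  return 'partial'
}

const testResults: TestResult[] = [
  { testName: 'help command renders', passed: true },
  { testName: 'borders align', passed: false, details: 'right border missing' },
]

// Illustrative payload; the session name and capture file name are made up.
const exampleOutput: CliUiTesterOutput = {
  overallStatus: deriveOverallStatus(testResults),
  summary: 'Checked /help rendering and border layout',
  testResults,
  scriptIssues: [],
  captures: [
    {
      path: 'debug/tmux-sessions/cli-test-123/capture-001-after-help-command.txt',
      label: 'after-help-command',
    },
  ],
}
```

Keeping `scriptIssues` as a required (possibly empty) array matches the schema's `required` list, so the parent agent can always test its length without a null check.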

cli/tmux.knowledge.md

Lines changed: 71 additions & 1 deletion
@@ -2,7 +2,77 @@
 
 This document covers essential knowledge for using tmux to test and automate the Codebuff CLI.
 
-## Critical: Sending Input to the CLI
+## Recommended: Use the Helper Scripts
+
+**For most CLI testing, use the helper scripts in `scripts/tmux/`** instead of raw tmux commands. These scripts handle bracketed paste mode, session management, and logging automatically.
+
+### Quick Start
+
+```bash
+# Start a test session
+SESSION=$(./scripts/tmux/tmux-cli.sh start)
+
+# Send a command
+./scripts/tmux/tmux-cli.sh send "$SESSION" "/help"
+
+# Capture output (auto-saves to debug/tmux-sessions/)
+./scripts/tmux/tmux-cli.sh capture "$SESSION" --wait 2 --label "after-help"
+
+# Stop the session
+./scripts/tmux/tmux-cli.sh stop "$SESSION"
+```
+
+### Available Scripts
+
+| Script | Purpose |
+|--------|--------|
+| `tmux-cli.sh` | Unified interface with subcommands (start, send, capture, stop, list) |
+| `tmux-start.sh` | Start a CLI test session with custom name/dimensions |
+| `tmux-send.sh` | Send input using bracketed paste mode (handles escaping) |
+| `tmux-capture.sh` | Capture terminal output with YAML metadata |
+| `tmux-stop.sh` | Stop individual or all test sessions |
+
+### Session Logs
+
+All session data is saved to `debug/tmux-sessions/{session}/` in YAML format:
+- `session-info.yaml` - Session metadata
+- `commands.yaml` - All commands sent with timestamps
+- `capture-*.txt` - Terminal captures with YAML front-matter
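
Reviewer sketch (not part of the commit): a consumer of the `capture-*.txt` files listed above needs to separate the YAML front-matter from the terminal content. The function below is a minimal illustration that assumes the front-matter is delimited by a leading and a closing `---` line, as in the example in `.agents/cli-ui-tester.ts`. The names `splitCapture` and `SplitCapture` are hypothetical, and the YAML is returned as raw text rather than parsed.

```typescript
interface SplitCapture {
  frontMatter: string // raw YAML text between the two --- delimiters
  content: string // terminal content after the closing delimiter
}

// Split a capture file into front-matter and content.
// Files without a well-formed front-matter block are returned unchanged.
function splitCapture(text: string): SplitCapture {
  const lines = text.split('\n')
  if (lines[0]?.trim() !== '---') return { frontMatter: '', content: text }

  // The first --- after line 0 closes the block; later --- lines
  // belong to the captured terminal content.
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === '---')
  if (end === -1) return { frontMatter: '', content: text }

  return {
    frontMatter: lines.slice(1, end).join('\n'),
    content: lines.slice(end + 1).join('\n'),
  }
}

const sample = [
  '---',
  'sequence: 1',
  'label: after-help-command',
  '---',
  '[terminal content]',
].join('\n')

const split = splitCapture(sample)
console.log(split.frontMatter)
console.log(split.content)
```

The `frontMatter` string can then be handed to whichever YAML parser the project already uses.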
+
+### Why Use Helper Scripts?
+
+1. Automatic **bracketed paste mode** so CLI input is reliable and characters are not dropped.
+2. Automatic **session logging** in `debug/tmux-sessions/{session}/` so you always have a reproducible paper trail.
+3. A shared **YAML format** consumed by both humans (via `tmux-viewer` TUI) and AIs (via `--json` output and the `@cli-ui-tester` agent).
+
+### Viewing Session Data
+
+Use the **tmux-viewer** to inspect sessions:
+
+```bash
+# Interactive TUI (for humans)
+bun .agents/tmux-viewer/index.tsx <session-name>
+
+# JSON output (for AI consumption)
+bun .agents/tmux-viewer/index.tsx <session-name> --json
+
+# List available sessions
+bun .agents/tmux-viewer/index.tsx --list
+```
+
+### CLI Tmux Tester Agent
+
+For automated testing, use the `@cli-ui-tester` agent which wraps all of this with structured output reporting.
+
+See `scripts/tmux/README.md` for comprehensive documentation.
+
+---
+
+## Manual Approach (Understanding the Internals)
+
+The sections below explain how tmux communication with the CLI works at a low level. This is useful for understanding why the helper scripts exist and for debugging edge cases.
+
+### Critical: Sending Input to the CLI
 
 **Standard `tmux send-keys` does NOT work with the Codebuff CLI.** Characters are dropped or garbled due to how OpenTUI handles keyboard input.
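
Reviewer sketch (not part of the commit): the fix this document relies on is bracketed paste mode, where the input is wrapped in `ESC[200~` and `ESC[201~`. The snippet below shows only how such a payload could be built and handed to `tmux send-keys` with its literal flag `-l`. The function names are hypothetical, and whether `tmux-send.sh` invokes tmux this way is an assumption, since the script body is not shown in this diff.

```typescript
// Standard bracketed paste delimiters (ESC [ 200 ~ and ESC [ 201 ~).
const PASTE_START = '\x1b[200~'
const PASTE_END = '\x1b[201~'

// Wrap text so the receiving application treats it as a single paste.
function wrapBracketedPaste(text: string): string {
  return `${PASTE_START}${text}${PASTE_END}`
}

// Build argv for `tmux send-keys`; -l sends the payload literally
// instead of interpreting it as key names. Assumed usage, not the
// helper script's confirmed implementation.
function buildSendKeysArgs(session: string, text: string): string[] {
  return ['send-keys', '-t', session, '-l', wrapBracketedPaste(text)]
}

// Enter is sent as a separate key press after the paste.
function buildEnterArgs(session: string): string[] {
  return ['send-keys', '-t', session, 'Enter']
}

const pasteArgs = buildSendKeysArgs('my-test', '/help')
const enterArgs = buildEnterArgs('my-test')
console.log(JSON.stringify(pasteArgs))
console.log(JSON.stringify(enterArgs))
```

Sending Enter separately keeps the key press outside the paste markers, so the application sees it as a real key rather than pasted text.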