Commit 9ed18e1

docs: add hand-off test plan for user-generated skills
Standalone walkthrough at docs/TESTING-USER-SKILLS.md covering smoke test, full workout, override behaviour, boundary cases, and likely failure modes. Intended to be passed to reviewers / testers who want to exercise the feature without reading the implementation.

Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>
1 parent 2a6fb86 commit 9ed18e1

1 file changed

Lines changed: 193 additions & 0 deletions

docs/TESTING-USER-SKILLS.md

@@ -0,0 +1,193 @@
# Testing the User-Generated Skills Feature

A walkthrough for verifying the **user skills** feature end-to-end. The
feature lets a user persist what HyperAgent learned in a session as a
reusable skill at `~/.hyperagent/skills/<name>/SKILL.md`. Saved skills
survive upgrades and override system skills with the same name.

Implemented on branch `feat-user-skills` (on top of the
`fix-marked-v15-renderer` fix for the markdown renderer crash).

---

## Prerequisites

- A working HyperAgent checkout on the `feat-user-skills` branch
- `just setup` already run (Rust addons built, deps installed); see the
  project [README](../README.md) and [DEVELOPMENT.md](DEVELOPMENT.md)
- A terminal where `just start` launches the agent successfully
- A working GitHub Copilot login for the agent's LLM calls

---

## 1. Smoke Test (~2 minutes)

This is the minimum bar: if this works, the feature is wired up
end-to-end.

```bash
# Use a throwaway skills dir so you don't pollute ~/.hyperagent/skills/
export HYPERAGENT_USER_SKILLS_DIR=/tmp/ha-skills-test
mkdir -p "$HYPERAGENT_USER_SKILLS_DIR"

just start
```

In the agent REPL:

```text
> /skills
```

This confirms the baseline: only **system** skills should appear, none
with the 👤 (user) badge.

Now do some work the agent will remember:

```text
> use the fetch plugin to grab https://example.com and tell me the title
```

Let it run to completion. Then ask the agent to save what it learned:

```text
> /save-skill fetch-page-title
```

**Expected behaviour:**

1. The agent receives a synthetic prompt summarising the session
   context (tools used, MCP servers, modules registered, recent errors)
2. The LLM calls the `generate_skill(...)` tool
3. You see an interactive approval prompt with a preview of the
   `SKILL.md` content
4. Hit `y` to approve

Verify the file landed on disk:

```bash
cat /tmp/ha-skills-test/fetch-page-title/SKILL.md
```

You should see a valid `SKILL.md` with YAML frontmatter (`name`,
`description`, `triggers`, etc.) and a markdown guidance body.

If that file exists, **the feature works.** 🎉
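
To check more than mere existence, the frontmatter can be inspected with a small shell helper. This is a minimal sketch, not part of HyperAgent: the function name `check_skill_frontmatter` is hypothetical, and it assumes the frontmatter is a leading block delimited by `---` lines that contains top-level `name:`, `description:` and `triggers:` keys, as described above. After sourcing it, run `check_skill_frontmatter /tmp/ha-skills-test/fetch-page-title/SKILL.md`; it prints `ok: <file>` on success and a reason on stderr otherwise.

```bash
# Hypothetical helper: sanity-check the frontmatter of a generated SKILL.md.
# Assumption: frontmatter is a leading block delimited by '---' lines with
# top-level name:, description: and triggers: keys.
check_skill_frontmatter() {
  local file="$1" key frontmatter
  [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }

  # The first line must open the frontmatter block.
  [ "$(head -n 1 "$file")" = "---" ] || { echo "no opening ---" >&2; return 1; }

  # Collect the lines between the opening and closing '---' delimiters.
  # awk exits non-zero if no closing delimiter is found.
  frontmatter="$(awk '
    NR > 1 && $0 == "---" { closed = 1; exit }
    NR > 1                { print }
    END                   { if (!closed) exit 1 }
  ' "$file")" || { echo "no closing ---" >&2; return 1; }

  # Every expected key must appear at the start of a frontmatter line.
  for key in name description triggers; do
    printf '%s\n' "$frontmatter" | grep -q "^${key}:" \
      || { echo "missing key: ${key}" >&2; return 1; }
  done
  echo "ok: $file"
}
```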

---

## 2. Full Workout

Exercise every command path. From a fresh `just start`:

```text
> /skills                          # list both system + user skills
> /skills info code-review         # show full detail for a system skill
> /save-skill                      # no name → LLM picks one
> /skills                          # user skill now shows with 👤
> /skills info fetch-page-title    # user skill detail (skill saved in section 1)
> /skills edit fetch-page-title    # opens $EDITOR for hand-tuning
> exit
```

Then restart the agent and repeat the original task. The
`/suggest_approach` matching should surface the saved skill via its
triggers.

---

## 3. Override Test

User skills must override system skills with the same name. Drop in a
user skill that shadows an existing system one:

```bash
mkdir -p "$HYPERAGENT_USER_SKILLS_DIR/code-review"
cat > "$HYPERAGENT_USER_SKILLS_DIR/code-review/SKILL.md" << 'EOF'
---
name: code-review
description: My customised code review skill
triggers: [review, audit]
---
This overrides the system version.
EOF

just start
```

In the REPL:

```text
> /skills info code-review
```

**Expected:** the **user** description ("My customised code review
skill") appears, and an **override** flag/note is present.
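
To see which user skills shadow a system skill before starting the agent, the two directories can be compared by name. This is a minimal sketch with a hypothetical function name. It assumes system skills use the same `<dir>/<name>/SKILL.md` layout as user skills, and because this document does not name the system skills directory, that path must be passed in explicitly, for example `list_overridden_skills /path/to/system/skills "$HYPERAGENT_USER_SKILLS_DIR"`. After the steps above, the output should include `code-review`.

```bash
# Hypothetical helper: print the names of user skills that shadow a system skill.
# Assumption: both directories use the <dir>/<name>/SKILL.md layout.
list_overridden_skills() {
  local system_dir="$1" user_dir="$2" skill_file name
  for skill_file in "$user_dir"/*/SKILL.md; do
    # When the glob matches nothing it stays literal; skip that case.
    [ -f "$skill_file" ] || continue
    name="$(basename "$(dirname "$skill_file")")"
    # A system skill with the same directory name means the user skill overrides it.
    if [ -f "$system_dir/$name/SKILL.md" ]; then
      echo "$name"
    fi
  done
}
```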

---

## 4. Negative / Boundary Tests

Validation should reject bad input cleanly without crashing the agent:

| Input | Expected outcome |
|-------|------------------|
| `/save-skill BadName` | Rejected: not kebab-case |
| `/save-skill ../escape` | Rejected: path traversal |
| `/save-skill thisnameisreallylongandshouldfailitsbeyondsixtyfourcharactersnowforsure` | Rejected: exceeds 64 chars |
| `/save-skill fetch-page-title` (second time) | Overwrite confirmation prompt |
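
The first three rows describe naming rules that can be approximated locally to generate further test inputs. The sketch below is an assumption about those rules, not the agent's actual validator: it treats kebab-case as lowercase alphanumeric words joined by single hyphens and caps the length at 64 characters. Rejecting every character outside `a-z`, `0-9` and `-` also rules out `/` and `.`, so path-traversal names fail the same check. The function name is hypothetical; it returns success for names such as `fetch-page-title` and failure for each rejected input in the table.

```bash
# Hypothetical approximation of the skill-name rules in the table above.
# Assumption: kebab-case means lowercase alphanumeric words joined by single
# hyphens; the 64-character limit is taken from the table.
is_valid_skill_name() {
  local name="$1"
  # Force byte-wise ranges so [a-z] never matches uppercase in other locales.
  local LC_ALL=C

  # Length limit.
  [ "${#name}" -le 64 ] || return 1

  case "$name" in
    # Empty, any character outside a-z 0-9 '-', leading or trailing hyphen,
    # or doubled hyphen: all rejected. '/' and '.' fall under the second
    # pattern, which covers path traversal such as ../escape.
    "" | *[!a-z0-9-]* | -* | *- | *--*) return 1 ;;
  esac
  return 0
}
```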

---

## 5. Cleanup

```bash
rm -rf /tmp/ha-skills-test
unset HYPERAGENT_USER_SKILLS_DIR
```

---

## Verification Checklist

| Observation | What it confirms |
|-------------|------------------|
| `generate_skill` appears in the tool log after `/save-skill` | LLM picked up the system-message guidance ✅ |
| Approval prompt shows a skill preview | Tool handler validation working ✅ |
| `SKILL.md` file lands on disk under `$HYPERAGENT_USER_SKILLS_DIR` | `writeUserSkill()` working ✅ |
| `/skills` shows the 👤 badge for the new skill | Multi-dir loader + `source` field working ✅ |
| `/skills info <name>` shows the override flag for shadowed system skills | Name-collision detection working ✅ |
| Restarting the agent matches the skill on similar prompts | `loadSkillsFromDirs` + boot wiring working ✅ |

---

## Likely Failure Modes & Where to Look

- **`/save-skill` runs but the LLM never calls `generate_skill`**: the
  synthetic prompt from `submitToLLM` may be too weak. See
  [src/agent/slash-commands.ts](../src/agent/slash-commands.ts) (the
  `/save-skill` handler) and
  [src/agent/system-message.ts](../src/agent/system-message.ts)
  ("SAVING WHAT YOU LEARN" section).
- **Tool not allowed**: every new tool must be registered in THREE
  places, namely the `tools[]` array, `ALLOWED_TOOLS` in
  [src/agent/tool-gating.ts](../src/agent/tool-gating.ts), and
  `availableTools[]` in the session config. Triple-check.
- **File written but `/skills` doesn't list it**:
  `loadSkillsFromDirs()` in
  [src/agent/skill-loader.ts](../src/agent/skill-loader.ts) may not be
  reading the user dir. Verify that `skillDirectories` in
  [src/agent/index.ts](../src/agent/index.ts) includes
  `getUserSkillsDir()`.
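
For the **Tool not allowed** case, a recursive search shows every place a tool name is mentioned, so the three registration points can be compared by eye. This is a minimal sketch with a hypothetical function name. It assumes all three registration points live in TypeScript files under `src/agent`, which this document only confirms for `tool-gating.ts`; pass a different directory as the second argument if they live elsewhere. From the repository root, `find_tool_registrations generate_skill` prints one `file:line:text` match per line and returns non-zero when nothing matches.

```bash
# Hypothetical helper: list every mention of a tool name in the agent sources.
# Assumption: the registration points are in *.ts files under src/agent.
find_tool_registrations() {
  local tool="$1" src_dir="${2:-src/agent}"
  # -r recurse, -n show line numbers, -F match the name literally,
  # --include limits the search to TypeScript files.
  grep -rn --include='*.ts' -F -- "$tool" "$src_dir"
}
```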

---

## Reporting Results

If something doesn't work, please capture:

1. The full agent REPL transcript
2. Contents of `$HYPERAGENT_USER_SKILLS_DIR` after the test (`ls -laR`)
3. The agent's debug log (`~/.hyperagent/logs/debug-*.log`)
4. The output of `just check` from the same checkout

Then share these with the implementer. Good hunting. 🎯
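
Items 2 to 4 of the list above can be gathered in one step with a small helper; the REPL transcript (item 1) still has to be saved by hand. This is a minimal sketch, not a project script: the function name and output file names are hypothetical, and it falls back to `~/.hyperagent/skills` when `HYPERAGENT_USER_SKILLS_DIR` is unset. Run it from the checkout as `collect_test_report /tmp/ha-report` and attach the resulting directory.

```bash
# Hypothetical helper: gather report items 2-4 into one directory.
collect_test_report() {
  local out_dir="$1"
  local skills_dir="${HYPERAGENT_USER_SKILLS_DIR:-$HOME/.hyperagent/skills}"
  local log
  mkdir -p "$out_dir/logs" || return 1

  # Item 2: recursive listing of the skills directory (errors are recorded too).
  ls -laR "$skills_dir" > "$out_dir/skills-listing.txt" 2>&1 || true

  # Item 3: copy any debug logs that exist.
  for log in "$HOME"/.hyperagent/logs/debug-*.log; do
    [ -f "$log" ] && cp "$log" "$out_dir/logs/"
  done

  # Item 4: output of `just check`, if `just` is available on PATH.
  if command -v just >/dev/null 2>&1; then
    just check > "$out_dir/just-check.txt" 2>&1 || true
  else
    echo "just not found on PATH" > "$out_dir/just-check.txt"
  fi

  echo "report written to $out_dir"
}
```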
