hotfix: plan-time retry on hallucinated skill names by AVADSA25 · Pull Request #41 · AVADSA25/codec

AVADSA25 · 2026-05-04T08:04:15Z

Reproducer (real, today)

User dropped:

"Read all markdown files in ~/codec-repo/docs/ and create an index.md that lists each file with its first heading and a one-line description"

Result:

❌ Plan failed: plan invalid: plan references unknown skills: ['file_read']

The user never even reached approve/reject. Plan validation rejected the draft because Qwen invented `file_read` (the actual skill is `file_ops`).

Why PR #35 didn't catch it

PR #35 added a single-shot correction-nudge retry inside `codec_agent_runner._execute_checkpoint` — that handles hallucinations at execution time (skill / write_path / read_path / domain). But validation lives earlier, in `codec_agent_plan.draft_plan` → `validate_plan_skills`. The plan is rejected before it's even saved, so the runner never gets a chance to retry.

Fix (mirrors PR #35 one layer up)

1. After `validate_plan_skills` returns `ok=False`, build a corrective prompt with:
   - The missing skill names
   - The FULL allowed registry list (closed-world choice)
   - The three most common confusions: `file_read`→`file_ops`, `fetch_url`→`web_fetch`, `read_file`→`file_ops`
2. Re-call Qwen ONCE with the appended correction.
3. Re-validate. Success → use the corrected plan. Failure → raise with both attempts in the message so the user sees consistent confusion (vs a one-off transient miss).
4. If the retry call itself flakes, surface the original validation error.
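
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `codec_agent_plan`: it assumes a plan is a dict with a `steps` list whose items carry a `skill` key, and that the chat client is any callable taking a prompt string and returning JSON text. `ALLOWED_SKILLS`, `PlanError`, `missing_skills`, `build_correction`, and `draft_plan_with_retry` are hypothetical names introduced for the example.

```python
import json
from typing import Callable, Iterable

# Hypothetical stand-ins: the real registry and validator live in
# codec_agent_plan and may look different.
ALLOWED_SKILLS = ("file_ops", "web_fetch")
COMMON_CONFUSIONS = {
    "file_read": "file_ops",
    "fetch_url": "web_fetch",
    "read_file": "file_ops",
}


class PlanError(ValueError):
    """Raised when a drafted plan cannot be validated."""


def missing_skills(plan: dict, allowed: Iterable[str]) -> list[str]:
    """Return the sorted skill names used by the plan that are not allowed."""
    return sorted({step["skill"] for step in plan["steps"]} - set(allowed))


def build_correction(missing: list[str], allowed: Iterable[str]) -> str:
    """Build the corrective text appended to the prompt for the single retry."""
    hints = ", ".join(f"{bad} -> {good}" for bad, good in COMMON_CONFUSIONS.items())
    return (
        f"Your plan referenced unknown skills: {missing}. "
        f"Use ONLY these skills: {sorted(allowed)}. "
        f"Common confusions: {hints}."
    )


def draft_plan_with_retry(prompt: str, chat: Callable[[str], str]) -> dict:
    """Draft a plan; on unknown skills, retry exactly once with a correction."""
    first = json.loads(chat(prompt))
    missing_first = missing_skills(first, ALLOWED_SKILLS)
    if not missing_first:
        return first

    original = f"plan references unknown skills: {missing_first}"
    try:
        correction = build_correction(missing_first, ALLOWED_SKILLS)
        second = json.loads(chat(prompt + "\n\n" + correction))
    except Exception as exc:
        # The retry call itself failed: keep the original validation error
        # in front, since it is the more diagnostic of the two.
        raise PlanError(f"{original} (retry failed: {exc})") from exc

    missing_second = missing_skills(second, ALLOWED_SKILLS)
    if missing_second:
        # Both attempts missed: report both so consistent confusion is visible.
        raise PlanError(
            f"{original}; retry still referenced unknown skills: {missing_second}"
        )
    return second
```

Listing the full registry in the correction turns the retry into a closed-world choice, and capping it at one retry keeps the added planning latency bounded.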

Also strengthen `_PLAN_SYSTEM_PROMPT` with the same three confusion hints so the FIRST draft is more likely to succeed (cuts the retry rate).

Tests (3 new)

- `test_draft_plan_retries_on_hallucinated_skill_then_succeeds`: reproduces the exact user case (`file_read` → `file_ops`)
- `test_draft_plan_retry_also_fails_raises_with_both_attempts`: both attempts miss, message contains both
- `test_draft_plan_retry_qwen_unavailable_surfaces_original_error`: retry call raises `ConnectionError`, original validation error surfaces
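
All three cases need a chat stand-in that answers differently on each call. One possible shape for such a test double is sketched below. `ScriptedChat` is a hypothetical helper, not the fake used in the repository's tests, and it assumes the chat function is called with a single prompt string.

```python
from typing import Union


class ScriptedChat:
    """Test double: returns queued replies in order, raising queued exceptions."""

    def __init__(self, *replies: Union[str, Exception]) -> None:
        self._replies = list(replies)
        self.prompts: list[str] = []  # every prompt seen, for later assertions

    def __call__(self, prompt: str) -> str:
        self.prompts.append(prompt)
        reply = self._replies.pop(0)
        if isinstance(reply, Exception):
            raise reply
        return reply


# Script for the "retry call flakes" case: a bad draft, then a connection error.
fake = ScriptedChat('{"steps": [{"skill": "file_read"}]}', ConnectionError("qwen down"))
```

Recording the prompts lets a test also check that the second call actually carried the correction text.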

All 3 existing `draft_plan` tests still pass — backward-compat preserved. `test_draft_plan_rejects_unknown_skill` now exercises BOTH attempts (fake returns same bad plan twice) and still raises with the missing skill in the message.

Tally: 35/35 file tests pass + 7 pre-existing pynput env failures (unchanged on baseline, unrelated to this PR).

Test plan after merge

- Drop the markdown-index project again. Expected: plan succeeds with `file_ops`, agent runs, index.md written
- Drop the forex briefing again. Expected: same fix path also helps if Qwen ever drafts it with hallucinated names

🤖 Generated with Claude Code

User repro 2026-05-04 09:58:

Project: "Read all markdown files in ~/codec-repo/docs/ and create an index.md that lists each file with its first heading and a one-line description"
Result: Plan failed: plan invalid: plan references unknown skills: ['file_read']

Same hallucination CLASS as PR #35 but at a different LAYER. PR #35 fixed retries during execution (codec_agent_runner). This is failing earlier, at plan validation, before the plan is even saved. The user never even got to the approve-or-reject step.

Root cause: Qwen drafts plans naming skills that don't exist. `file_read` and `fetch_url` are the two we've seen. The actual file-reading skill is `file_ops` (which reads, writes, appends, lists). The actual URL fetch is `web_fetch`. The user-visible result was the same as PR #35: project mode dies before doing anything useful.

Fix (mirrors PR #35's pattern at planning layer):

1. After validate_plan_skills returns ok=False, instead of raising, build a corrective prompt listing the missing skills, the FULL allowed registry, and the three most common confusions (file_read→file_ops, fetch_url→web_fetch, read_file→file_ops).
2. Re-call _qwen_chat ONCE with the appended correction.
3. Re-validate the second draft. If valid, use it. If not, raise with BOTH attempts in the message so the user sees the model is consistently confused (vs a one-off transient miss).
4. If the retry call itself fails (Qwen flakes between attempts), raise with the ORIGINAL validation error, which is more diagnostic than "qwen flaked on retry".

Also:

- Strengthen _PLAN_SYSTEM_PROMPT with the same three confusion hints so the FIRST draft is more likely to succeed (cuts the retry rate).

Tests (3 new in tests/test_agent_plan.py, all pass):

- test_draft_plan_retries_on_hallucinated_skill_then_succeeds
  Reproduces the exact user case: file_read on attempt 1, file_ops on attempt 2, plan succeeds.
- test_draft_plan_retry_also_fails_raises_with_both_attempts
  Both attempts hallucinate (file_read, then read_file): error message contains both for diagnostic value.
- test_draft_plan_retry_qwen_unavailable_surfaces_original_error
  Retry call raises ConnectionError: original validation error surfaces with "retry failed" appended.

All 3 existing draft_plan tests still pass, so backward-compat is preserved. The existing test_draft_plan_rejects_unknown_skill now exercises BOTH attempts (fake_qwen_chat returns same bad plan each time) and still raises with the missing skill in the message.

Total: 35/35 file pass + 7 pre-existing pynput env failures (unchanged).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

AVADSA25 merged commit 028550d into main May 4, 2026
1 check passed

AVADSA25 merged 1 commit into `main` from `fix/plan-time-retry`
