hotfix: plan-time retry on hallucinated skill names#41
Merged
Conversation
User repro 2026-05-04 09:58:
Project: "Read all markdown files in ~/codec-repo/docs/ and create
an index.md that lists each file with its first heading and
a one-line description"
Result: Plan failed: plan invalid: plan references unknown skills:
['file_read']
Same hallucination CLASS as PR #35 but at a different LAYER.
PR #35 fixed retries during execution (codec_agent_runner). This is
failing earlier — at plan validation, before the plan is even saved.
The user never even got to the approve-or-reject step.
Root cause:
Qwen drafts plans naming skills that don't exist. `file_read` and
`fetch_url` are the two we've seen. The actual file-reading skill is
`file_ops` (which reads, writes, appends, lists). The actual URL fetch
is `web_fetch`. The user-visible result was the same as PR #35 —
project mode dies before doing anything useful.
Fix (mirrors PR #35's pattern at planning layer):
1. After validate_plan_skills returns ok=False, instead of raising,
build a corrective prompt listing the missing skills, the FULL
allowed registry, and the three most common confusions
(file_read→file_ops, fetch_url→web_fetch, read_file→file_ops).
2. Re-call _qwen_chat ONCE with the appended correction.
3. Re-validate the second draft. If valid, use it. If not, raise
with BOTH attempts in the message so the user sees the model is
consistently confused (vs a one-off transient miss).
4. If the retry call itself fails (Qwen flakes between attempts),
raise with the ORIGINAL validation error — more diagnostic than
"qwen flaked on retry".
Also:
- Strengthen _PLAN_SYSTEM_PROMPT with the same three confusion hints
so the FIRST draft is more likely to succeed (cuts the retry rate).
Tests (3 new in tests/test_agent_plan.py — all pass):
- test_draft_plan_retries_on_hallucinated_skill_then_succeeds
Reproduces the exact user case: file_read on attempt 1, file_ops
on attempt 2, plan succeeds.
- test_draft_plan_retry_also_fails_raises_with_both_attempts
Both attempts hallucinate (file_read, then read_file): error
message contains both for diagnostic value.
- test_draft_plan_retry_qwen_unavailable_surfaces_original_error
Retry call raises ConnectionError: original validation error
surfaces with "retry failed" appended.
All 3 existing draft_plan tests still pass — backward-compat preserved.
The existing test_draft_plan_rejects_unknown_skill now exercises BOTH
attempts (fake_qwen_chat returns same bad plan each time) and still
raises with the missing skill in the message.
Total: 35/35 file pass + 7 pre-existing pynput env failures (unchanged).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Reproducer (real, today)
User dropped:
Result:
The user never even reached approve/reject. Plan validation rejected the draft because Qwen invented `file_read` (the actual skill is `file_ops`).
Why PR #35 didn't catch it
PR #35 added a single-shot correction-nudge retry inside `codec_agent_runner._execute_checkpoint` — that handles hallucinations at execution time (skill / write_path / read_path / domain). But validation lives earlier, in `codec_agent_plan.draft_plan` → `validate_plan_skills`. The plan is rejected before it's even saved, so the runner never gets a chance to retry.
Fix (mirrors PR #35 one layer up)
Also strengthen `_PLAN_SYSTEM_PROMPT` with the same three confusion hints so the FIRST draft is more likely to succeed (cuts the retry rate).
Tests (3 new)
All 3 existing `draft_plan` tests still pass — backward-compat preserved. `test_draft_plan_rejects_unknown_skill` now exercises BOTH attempts (fake returns same bad plan twice) and still raises with the missing skill in the message.
Tally: 35/35 file tests pass + 7 pre-existing pynput env failures (unchanged on baseline, unrelated to this PR).
Test plan after merge
🤖 Generated with Claude Code