feat(sdlc): /goal SDLC-discipline gates — 95% confidence floor + DLC binding (PR-D) by BaseInfinity · Pull Request #355 · BaseInfinity/claude-sdlc-wizard

BaseInfinity · 2026-05-25T00:17:03Z

Summary

Native `/goal` is now table-stakes across CC (v2.1.139+), Codex CLI, and likely others. Without SDLC discipline baked into the goal CONDITION itself, the Haiku evaluator rubber-stamps "did the agent flail for 20 turns" instead of "is the goal met correctly."

This PR extends the `/goal` wrapper in the `/sdlc` skill with two new load-bearing gates.

The two new rules

1. Confidence gate — NEVER invoke `/goal` below HIGH 95%

Mirrors the existing Confidence Check rule. Below 95%, plan first, then `/goal`. The evaluator has no anchor for "is this correct"; only for "did the agent stop." Invoking at MEDIUM/LOW lets the evaluator certify wishful thinking.

2. DLC binding — condition MUST name the active DLC

`/sdlc` for code, `/gdlc` for games, `/ldlc` for legal, etc. Without the DLC name in the condition string, the evaluator anchors on completion only ("did the work end") instead of correctness ("did it follow the contract").

Example:

/goal "tests pass + clean tree following /sdlc, stop after 20 turns"

The "following /sdlc" clause makes the evaluator judge SDLC compliance per turn, not just task completion.
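The two rules can be read as a pre-flight check on the condition string. The following is a minimal Python sketch, not the wizard's implementation: the function names, the numeric `confidence_pct` parameter, and the exact regular expression are assumptions made for illustration (the pattern is modeled on the `/sdlc`, `/gdlc`, `/ldlc` names above).

```python
import re

# Assumption: a DLC name is a slash, one or more lowercase letters, then "dlc"
# (e.g. /sdlc, /gdlc, /ldlc). The lookbehind avoids matching inside file paths.
DLC_PATTERN = re.compile(r"(?<![\w/])/[a-z]+dlc\b")


def find_dlc_bindings(condition: str) -> list[str]:
    """Return every DLC name mentioned in a /goal condition string."""
    return DLC_PATTERN.findall(condition)


def check_goal_condition(condition: str, confidence_pct: int) -> list[str]:
    """Return a list of rule violations; an empty list means both gates pass.

    confidence_pct is a hypothetical numeric stand-in for the HIGH/MEDIUM/LOW
    confidence the agent states before invoking /goal.
    """
    problems = []
    if confidence_pct < 95:
        problems.append("confidence below 95%: plan first, then /goal")
    if not find_dlc_bindings(condition):
        problems.append("condition does not name a DLC (e.g. /sdlc)")
    return problems
```

With the example condition above and a stated confidence of 95, `check_goal_condition` returns an empty list; dropping the "following /sdlc" clause produces the DLC-binding violation.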

Why now

Native `/goal` shipped in CC v2.1.139 (~mid-May 2026) and was missed by the wizard's autoupdate pipeline (the gap #350 fixes). Now that it's table-stakes across agentic tools, anchoring it in SDLC discipline is the right wizard play: it turns a "let it cook" primitive into a "let it cook properly" primitive.

What changed

`skills/sdlc/SKILL.md` — two new rules in the `## Long-Running Goals (/goal)` section
`tests/test-doc-consistency.sh::test_sdlc_skill_has_goal_wrapper` — extended grep to enforce both new keywords (`95%`/`confidence gate` + `DLC binding`/`name the DLC`)
Compensating trims to stay under the #236 5K-token cap: condensed /goal example, collapsed Multi-reviewer paragraph
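The extended doc-consistency test amounts to "at least one phrase from each keyword group must appear in the skill text". The actual test is a shell grep; the Python sketch below only illustrates that logic. The group names, the function name, and the case-insensitive matching are assumptions, and the phrase lists are taken from the squash-commit message on this page.

```python
# Each group passes if ANY of its alternative phrases appears in the text.
KEYWORD_GROUPS = {
    "confidence gate": ["95% confidence", "HIGH 95%", "confidence gate"],
    "DLC binding": ["DLC binding", "name the DLC", "condition MUST name"],
}


def missing_keyword_groups(text: str) -> list[str]:
    """Return the names of keyword groups with no matching phrase in text."""
    lowered = text.lower()  # assumption: matching is case-insensitive
    return [
        name
        for name, alternatives in KEYWORD_GROUPS.items()
        if not any(phrase.lower() in lowered for phrase in alternatives)
    ]
```

A check like this guards only against the rules being deleted from `SKILL.md`; it says nothing about whether the wording is still correct.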

Test plan

40/40 `test-doc-consistency.sh` (the extended `test_sdlc_skill_has_goal_wrapper` now checks 9 keywords: section-header, version-floor, trusted-workspace, disableAllHooks, hard-bound, anti-pattern, resume-caveat, 95%-confidence-gate, DLC-binding)
10/10 `test-audit-session-load.sh` (skill at 4989/5000 tokens — under cap)
CI `validate` green

…DLC binding (PR-D)

Native /goal is now table-stakes across CC (v2.1.139+), Codex CLI, and likely others. Without SDLC discipline baked into the goal CONDITION itself, the Haiku evaluator rubber-stamps 'did the agent flail for 20 turns' instead of 'is the goal met correctly.'

Two new load-bearing gates in skills/sdlc/SKILL.md ## Long-Running Goals section:

1. **Confidence gate: NEVER invoke below HIGH 95%.** Mirrors existing Confidence Check (plan first if below). Below 95% the evaluator has no anchor for 'is this correct'; only for 'did the agent stop.'
2. **DLC binding: condition MUST name the active DLC** (/sdlc for code, /gdlc for games, /ldlc for legal, etc.). Anchors the evaluator on 'doing it right,' not just 'doing it.' Example: /goal 'tests pass + clean tree following /sdlc, stop after 20 turns'.

Quality test extended (tests/test-doc-consistency.sh::test_sdlc_skill_has_goal_wrapper): adds keyword greps for '95% confidence' / 'HIGH 95%' / 'confidence gate' AND 'DLC binding' / 'name the DLC' / 'condition MUST name'.

Cross-cutting compensating trims to stay under #236 5K-token cap (skill at 4989/5000):
- /goal section condensed example wording ('tests pass + clean tree' vs 'npm test=0 AND git clean')
- Cross-Model Review's Multi-reviewer paragraph collapsed into one line

40/40 doc-consistency green, 10/10 audit green.

@latest

…) (#361)

`npx agentic-sdlc-wizard init` (no @latest pin) silently serves whatever version is cached on disk, sometimes months old. The /update Step 1.5 pattern already catches this, but only at /update time, after the user has already installed stale templates. This moves the same check to init time so the gap surfaces immediately:

- cli/init.js: new maybeEmitStaleCliNudge() called before planOperations. Reuses SDLC_WIZARD_CACHE_DIR/latest-version (24h TTL, #239 poison check), falls back to `npm view` with 5s timeout, silent on offline/error.
- README.md + CLAUDE_CODE_SDLC_WIZARD.md: install command recommends `npx -y agentic-sdlc-wizard@latest init` so first-time users skip the cache trap by default.
- tests/test-cli.sh: 3 quality tests covering nudge-fires-when-stale, silent-when-current, no-reverse-nudge-when-poisoned.

Bundled: ROADMAP #347 + #350 status flip from "Actionable now" to DONE with PR citations (#351/#355/#354). Both shipped this week; the paperwork was stale.

Verified locally: test-cli.sh 91/91, test-doc-consistency 40/40, test-docs-usability 29/29, test-update-skill-cli-version 8/8, test-hooks 156/156, test-setup-path 83/83, test-self-update 153/153.
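The init-time check described above (cached latest version with a 24h TTL, `npm view` fallback with a 5s timeout, silence on any error) can be sketched as follows. This is a Python illustration of the flow only; the real code lives in `cli/init.js`, and every function name here is hypothetical. The "no reverse nudge" behavior is modeled simply as "nudge only when the latest version is strictly newer", which is an assumption about what the #239 poison check guards against.

```python
import os
import subprocess
import time
from typing import Optional

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24h TTL, as in the description above


def parse_version(text: str) -> Optional[tuple]:
    """Parse 'MAJOR.MINOR.PATCH' (optionally prefixed with 'v'); None if malformed."""
    parts = text.strip().lstrip("v").split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return None
    return tuple(int(p) for p in parts)


def read_cached_latest(cache_file: str, now: Optional[float] = None) -> Optional[str]:
    """Return the cached version string if the file exists and is fresh."""
    now = time.time() if now is None else now
    try:
        if now - os.path.getmtime(cache_file) > CACHE_TTL_SECONDS:
            return None
        with open(cache_file, encoding="utf-8") as fh:
            return fh.read().strip() or None
    except OSError:
        return None


def query_npm_latest(package: str) -> Optional[str]:
    """Ask the registry via `npm view`; stay silent (None) when offline or on error."""
    try:
        result = subprocess.run(
            ["npm", "view", package, "version"],
            capture_output=True, text=True, timeout=5, check=True,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip() or None


def stale_nudge(installed: str, latest: Optional[str]) -> Optional[str]:
    """Return a nudge message only when latest is strictly newer than installed."""
    have, want = parse_version(installed), parse_version(latest or "")
    if have is None or want is None or want <= have:
        return None  # current, unparseable, or a cache older than the install
    return (f"agentic-sdlc-wizard {installed} is older than {latest}; "
            "try: npx -y agentic-sdlc-wizard@latest init")
```

A caller would try `read_cached_latest(...)` first, fall back to `query_npm_latest("agentic-sdlc-wizard")`, and print the result of `stale_nudge(...)` only when it is not `None`.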

(#363)

PR #361 shipped 3 #358 tests where Test C was silent-when-cache-poisoned rather than the goal-named silent-when-offline. The /goal Haiku evaluator counted tests by file presence and approved completion. Caught only at post-merge self-review, fixed in PR #362 (added the real offline test via fake-npm PATH override).

Promoted to SDLC.md ## Lessons Learned → ### Testing so future sessions inherit the rule: when a /goal condition enumerates test cases by name, self-review must verify each named test exists by reading assertions, not by counting matching `test_` functions. This is the per-test-fidelity layer beneath PR #355's HIGH-95%-confidence and DLC-binding gates: those work at the macro level, but enumerated-condition fidelity is the author's responsibility.

Verified locally: test-doc-consistency 40/40, test-postmortem-lessons 7/7, test-memory-audit-protocol 12/12.
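A name-based presence check is a cheap first pass before reading assertions, and it would have flagged the mismatch described above, where three tests existed but one named case did not. The sketch below is a hypothetical Python helper, not part of the repo. It assumes shell test functions are declared as `test_<name>() {` and that a case name such as `silent-when-offline` maps to a function whose name contains `silent_when_offline`. It does not replace reading each test's assertions.

```python
import re


def normalize(name: str) -> str:
    """Lowercase and collapse separators so 'silent-when-offline' ~ 'silent_when_offline'."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def missing_named_tests(named_cases: list[str], test_source: str) -> list[str]:
    """Return the enumerated case names with no matching test_ function.

    Assumption: functions are declared as `test_foo() {` or `function test_foo() {`.
    """
    defined = re.findall(
        r"^\s*(?:function\s+)?(test_\w+)\s*\(\)", test_source, flags=re.MULTILINE
    )
    defined_norm = [normalize(name[len("test_"):]) for name in defined]
    return [
        case for case in named_cases
        if not any(normalize(case) in fn for fn in defined_norm)
    ]
```

Counting alone passes when three functions exist for three named cases; matching by name reports the case that was silently swapped for another.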

PR #355 (v1.77.0) added skill-text guidance that `/goal` must state HIGH 95% confidence and bind to a DLC (`/sdlc`, `/gdlc`, `/ldlc`) before firing. The guidance worked at the meta level but had no runtime enforcement, which is exactly the failure mode the discipline gate documented (text-only guidance doesn't survive the failure mode it's trying to prevent).

This adds `hooks/goal-confidence-check.sh` as a UserPromptSubmit hook that:

1. Matches `/goal <condition>` prompts (silent on `/goal` status + `/goal clear`).
2. Reads `transcript_path` from hook input (verified available on UserPromptSubmit per Anthropic hook docs at code.claude.com/docs/en/hooks), walks the JSONL to find the last assistant text message, and scans for HIGH-95% confidence patterns (`HIGH (95%`, `Confidence: HIGH`, `HIGH 95%`, etc.).
3. Greps the goal condition for a DLC binding (`/[a-z]+dlc`).
4. Emits LOUD warnings on either gap: a non-blocking soft nudge (exit 0), same pattern as `model-effort-check.sh`.

Registered in both channels: `cli/templates/settings.json` (npm/CLI) and `hooks/hooks.json` (plugin via ${CLAUDE_PLUGIN_ROOT}). Dedupes via `_find-sdlc-root.sh` helper so dual installs don't double-fire.

Bundled paperwork:
- `ROADMAP.md` adds the missing `## Research Parking Lot` section that the Demand-Signal-First entry gate references but was never created. Includes the maintenance rule (prune expired rows during quarterly triage).
- `skills/sdlc/SKILL.md` adds a one-line enforcement cross-reference in the Long-Running Goals section. Trimmed adjacent prose to stay under the 5K token cap (#236).

#359 triage (companion close): Of the 10 API changelog entries since 2026-04-16:
- 8 are clearly no-op for the wizard (MCP tunnels, SEC web search data, cache diagnostics, fast mode, AWS hosting, multiagent sessions, rate limits API, managed-agent memory; all orthogonal to SDLC enforcement).
- 2 are deprecations, verified zero usage in this repo: `context-1m-2025-08-07` (Sonnet 4.5/4 1M beta retired 2026-04-30; we use the separate `opus[1m]` alias) and `claude-3-haiku-20240307` (Haiku 3 retired 2026-04-20; we never used it).

#359 will be closed with the triage comment after merge.

Verified locally: test-hooks 160/160 (4 new #360 tests + 156 existing), test-audit-session-load 10/10, test-cli 91/91, test-plugin all green, test-doc-consistency 41/41, test-self-update 153/153, test-setup-path 83/83.
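The four hook steps can be sketched end to end. The real hook is a shell script (`hooks/goal-confidence-check.sh`); the Python below is only an illustration of the logic. Several details are assumptions: that the hook input JSON carries `prompt` and `transcript_path` fields, that transcript entries look like `{"type": "assistant", "message": {"content": [{"type": "text", "text": ...}]}}`, that "silent on `/goal` status" means a bare `/goal`, and the exact regular expressions (the source lists the confidence patterns only partially, ending in "etc.").

```python
import json
import re
from typing import Iterable, Optional

GOAL_PROMPT = re.compile(r"^\s*/goal\s+(?P<condition>.+)$", re.DOTALL)
# Covers the listed patterns: "HIGH (95%", "HIGH 95%", "Confidence: HIGH".
CONFIDENCE = re.compile(r"HIGH\s*\(?\s*95\s*%|Confidence:\s*HIGH")
DLC_BINDING = re.compile(r"/[a-z]+dlc\b")


def goal_condition(prompt: str) -> Optional[str]:
    """Step 1: return the condition of a `/goal <condition>` prompt, else None."""
    match = GOAL_PROMPT.match(prompt)
    if not match:
        return None  # not a /goal prompt, or a bare /goal
    condition = match.group("condition").strip()
    if not condition or condition.lower() == "clear":
        return None
    return condition


def last_assistant_text(jsonl_lines: Iterable[str]) -> str:
    """Step 2: walk the transcript and keep the last non-empty assistant text."""
    last = ""
    for line in jsonl_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or malformed lines
        if not isinstance(entry, dict) or entry.get("type") != "assistant":
            continue
        message = entry.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if isinstance(content, str):
            text = content
        elif isinstance(content, list):
            text = "\n".join(
                block.get("text", "") for block in content
                if isinstance(block, dict) and block.get("type") == "text"
            )
        else:
            text = ""
        if text.strip():
            last = text
    return last


def goal_warnings(prompt: str, transcript_lines: Iterable[str]) -> list[str]:
    """Steps 2-4: return warnings for a missing confidence statement or DLC name."""
    condition = goal_condition(prompt)
    if condition is None:
        return []
    warnings = []
    if not CONFIDENCE.search(last_assistant_text(transcript_lines)):
        warnings.append("WARNING: /goal fired without a stated HIGH 95% confidence")
    if not DLC_BINDING.search(condition):
        warnings.append("WARNING: /goal condition does not name a DLC (e.g. /sdlc)")
    return warnings


def run_hook(raw_input: str) -> int:
    """Parse hook input, print any warnings, and always return 0 (non-blocking)."""
    try:
        payload = json.loads(raw_input)
        prompt = str(payload.get("prompt", ""))
        path = payload.get("transcript_path", "")
    except (ValueError, AttributeError):
        return 0
    lines: list[str] = []
    try:
        if path:
            with open(path, encoding="utf-8") as fh:
                lines = fh.readlines()
    except OSError:
        pass  # unreadable transcript: treat as "no confidence statement found"
    for warning in goal_warnings(prompt, lines):
        print(warning)
    return 0
```

As a script entry point this would be wired as `sys.exit(run_hook(sys.stdin.read()))`. Returning 0 in every branch mirrors the "soft nudge" design: the hook can warn but never blocks the prompt.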

BaseInfinity force-pushed the feat/goal-sdlc-discipline branch from 7ac9990 to f001405 Compare May 25, 2026 00:28

BaseInfinity enabled auto-merge (squash) May 25, 2026 00:28

BaseInfinity merged commit 2e65da7 into main May 25, 2026
4 checks passed

BaseInfinity mentioned this pull request May 26, 2026

docs(sdlc): promote /goal enumerated-test-fidelity lesson from #361/#362 #363

Merged

5 tasks

This was referenced May 28, 2026

Hold Opus 4.7 → 4.8 model-pin bump 48-72h for sentiment, then A/B via local-shepherd #365

Closed

feat(hooks): /goal SDLC discipline gate (closes #360, refs #359) #369

Merged


BaseInfinity merged 1 commit into main from feat/goal-sdlc-discipline


1 participant
