feat(model): bump recommendation Opus 4.7 → 4.8 + v1.78.0 (closes #365) #371

Merged: BaseInfinity merged 1 commit into main from bump/opus-4.8-recommendation on Jun 3, 2026

Conversation

@BaseInfinity
Owner

Summary

  • Bump wizard recommended model from Opus 4.7 → 4.8 across all surface files (closes #365, "Hold Opus 4.7 → 4.8 model-pin bump 48-72h for sentiment, then A/B via local-shepherd")
  • Bump min Claude Code version v2.1.111+ → v2.1.154+ (required for opus[1m] alias to resolve to 4.8)
  • Bump wizard version v1.77.0 → v1.78.0 across all 6 SDLC checklist locations + new CHANGELOG entry
  • opus[1m] alias auto-resolves to latest Opus, so settings.json template is unchanged — prose-and-version-bump only
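
The minimum-version bump above implies a dotted-version comparison somewhere in the tooling. The sketch below is only an illustration of such a gate, not the wizard's actual check: `version_at_least` is a hypothetical helper name, and it assumes it is handed a bare `MAJOR.MINOR.PATCH` string (optionally prefixed with `v`). How the installed Claude Code version is obtained is left out.

```shell
# Hypothetical helper (not from this repo): succeed when version $1 >= version $2.
# Assumes plain numeric MAJOR.MINOR.PATCH strings, with an optional leading "v".
version_at_least() {
  have=${1#v}
  want=${2#v}
  for i in 1 2 3; do
    h=${have%%.*}   # leading component of the installed version
    w=${want%%.*}   # leading component of the required version
    [ "$h" -gt "$w" ] && return 0
    [ "$h" -lt "$w" ] && return 1
    # Components are equal: drop them and compare the next ones (missing ones count as 0).
    case $have in *.*) have=${have#*.} ;; *) have=0 ;; esac
    case $want in *.*) want=${want#*.} ;; *) want=0 ;; esac
  done
  return 0   # all three components equal
}
```

With the versions named in this PR, `version_at_least 2.1.159 2.1.154` succeeds, while `version_at_least 2.1.111 2.1.154` fails.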

Why now

  • Opus 4.8 launched 2026-05-28
  • Day 3-5 in-the-wild sentiment settled positive (vs. day-1 mixed reaction)
  • Proof-of-life via claude --print --model claude-opus-4-8 confirmed reachability on current CC v2.1.159

What changed

| Layer | Files |
|---|---|
| Wizard surface prose | SDLC.md, CLAUDE_CODE_SDLC_WIZARD.md, skills/sdlc/SKILL.md, skills/setup/SKILL.md, skills/update/SKILL.md |
| Hook + JS | hooks/model-effort-check.sh, cli/lib/repo-complexity.js |
| Docs | CI_CD.md, tests/e2e/local-shepherd.sh, tests/test-hooks.sh, tests/test-repo-complexity.sh (comments) |
| Version | package.json, .claude-plugin/{plugin,marketplace}.json, CHANGELOG.md |
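
Because the version string lives in several files, a quick consistency check can catch a missed location. The following is a minimal sketch under stated assumptions: `check_version_in_files` is a hypothetical helper, not part of this repo's test suite, and it does a plain fixed-string search, so it would also accept the version appearing as a substring of a longer string.

```shell
# Hypothetical helper (not from this repo): succeed only if every listed file
# contains the given version as a fixed string; report each file that does not.
check_version_in_files() {
  version=$1
  shift
  missing=0
  for f in "$@"; do
    if ! grep -qF -- "$version" "$f" 2>/dev/null; then
      echo "missing $version in $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Run from the repo root, an invocation might look like `check_version_in_files 1.78.0 package.json CHANGELOG.md`, extended with whichever other version locations apply.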

Effort semantics preserved. Strict effort behavior introduced in 4.7 carried forward to 4.8 — max remains the recommended default, xhigh the floor. References to "Opus 4.7+" with the plus sign are intentional historical anchors.

Un-run gates tracked as post-deploy obligations on #365

| Gate | Status |
|---|---|
| Gate 1: Sentiment | ✅ Day 3-5 positive |
| Gate 2: A/B coder quality vs 4.7 on real PRs | ⏳ Post-deploy |
| Gate 3: 24h dogfood | ⏳ Post-deploy |

Revert path: Since settings.json template is unchanged (alias does the resolution), revert is prose-only if 4.8 surfaces the system-card-flagged regressions (Gray Swan prompt-injection +60%, file-deletion tendency, eval-awareness affecting wizard output).

Test plan

  • tests/test-hooks.sh — 160/160
  • tests/test-self-update.sh — 153/153
  • tests/test-doc-consistency.sh — 41/41
  • tests/test-docs-usability.sh — 29/29 (validates new 1.78.0 anchor in skills/update/SKILL.md)
  • tests/test-audit-session-load.sh — 10/10 (skills/update/SKILL.md trimmed back under 5K cap after CHANGELOG entry addition)
  • tests/test-repo-complexity.sh — 11/11
  • CI validate green
  • Post-merge: spot-check /sdlc session start hook output mentions "Opus 4.8" not "4.7"
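
The post-merge spot-check in the last item could be scripted roughly as follows. This is a sketch only: `mentions_current_model` is a hypothetical function that reads captured hook output on stdin, and the way that output is captured is not shown because it depends on the local setup. It tolerates the intentional "Opus 4.7+" historical anchors described under "What changed".

```shell
# Hypothetical check (not from this repo): read hook output on stdin and succeed
# only if it names "Opus 4.8" and contains no stale "Opus 4.7" reference.
mentions_current_model() {
  # Drop the intentional historical anchor "Opus 4.7+" before checking.
  out=$(sed 's/Opus 4\.7+//g')
  case $out in
    *"Opus 4.7"*) return 1 ;;   # stale recommendation still present
    *"Opus 4.8"*) return 0 ;;   # new recommendation present
    *) return 1 ;;              # neither model mentioned
  esac
}
```

Piping saved hook output into the function, for example `mentions_current_model < hook-output.txt` (the file name is an assumption), yields a pass/fail exit status.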

Opus 4.8 launched 2026-05-28. Sentiment day 3-5 settled positive;
proof-of-life via `claude --print --model claude-opus-4-8` confirmed
reachability. `opus[1m]` alias auto-resolves to latest Opus on
CC v2.1.154+, so the settings.json template is unchanged; this is a
prose-and-version-bump-only change.

Min CC v2.1.111+ → v2.1.154+ (required for opus[1m] to resolve to 4.8).

Updated prose across 7 wizard surface files; bumped version across all
6 SDLC checklist locations + CHANGELOG.

Un-run gates 2 (A/B coder quality) and 3 (24h dogfood) tracked as
post-deploy follow-up obligations on #365. If real-world use surfaces
the system-card-flagged regressions for 4.8 (prompt-injection +60% on
Gray Swan, file-deletion tendency, eval-awareness affecting wizard
output), revert via a single prose-only PR.

Tests: test-hooks 160/160, test-self-update 153/153,
test-doc-consistency 41/41, test-docs-usability 29/29 (validates 1.78.0
anchor), test-audit-session-load 10/10, test-repo-complexity 11/11.

@BaseInfinity merged commit 088dc9c into main on Jun 3, 2026
4 checks passed
@BaseInfinity deleted the bump/opus-4.8-recommendation branch on June 3, 2026 03:01