Unlock model-driven question discovery in review and design skills #2
Open
alexei-led wants to merge 2 commits into vladikk:main
Conversation
Replace rigid hard-coded questionnaires with dynamic gap discovery. The model reads code first, surfaces its understanding, then asks only questions whose answers would materially change the analysis.
Contributor
Pull request overview
This PR updates the review and high-level-design skills to shift Step 1 from a fixed questionnaire toward model-driven gap discovery: the model reads requirements/code first, synthesizes its understanding with confidence, then asks only targeted follow-up questions that materially affect Balanced Coupling decisions.
Changes:
- Review skill Step 1 now: ask scope → read requirements + code → synthesize understanding + validate → ask targeted gap questions.
- High-level-design skill Step 1.2 now frames ambiguity discovery specifically around Balanced Coupling inputs (volatility/distance/strength) and applies a “materiality” filter to questions.
- Both skills add guidance to ground questions in concrete observations (code or requirements) rather than generic prompts.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| skills/review/SKILL.md | Replaces the fixed domain/teams/pain-points questionnaire with synthesis-first + targeted gap discovery. |
| skills/high-level-design/SKILL.md | Replaces generic ambiguity listing with coupling-aware gap discovery instructions. |
Question Flexibility
This PR changes how the review and high-level-design skills gather information from users. The short version: the model now reads the code first and figures out what to ask, instead of following a fixed questionnaire.
Why this matters
Your Balanced Coupling model is powerful — it needs three dimensions (strength, distance, volatility) to assess coupling properly. The model already has this framework loaded. But the current skills don't trust it to use that knowledge. Instead, they follow a rigid script:
These questions fire every time regardless of what the code or requirements already reveal. The model is perfectly capable of reading the code, applying the Balanced Coupling lens, and discovering what information it actually needs — we just weren't letting it.
What changed
Review skill — Step 1 rewritten
Before: Read requirements → Ask Scope → Read code → Ask Domain (fixed) → Ask Teams (fixed) → Ask Pain points (fixed)
After: Ask Scope → Read code + requirements → Surface understanding (model presents what it learned, user validates/corrects) → Discover gaps (model identifies what's missing for coupling assessment, asks targeted questions)
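The revised flow can be sketched roughly as follows. This is an illustrative Python sketch only, not the actual skill implementation; the function names (`step_one`, `find_gaps`) and the `material` flag are hypothetical stand-ins for the behavior described above.

```python
def step_one(ask, read_sources, synthesize, find_gaps):
    """Sketch of the rewritten Step 1: read first, then ask only material questions.

    All four parameters are caller-supplied callables (hypothetical names):
    ask(question) -> answer, read_sources(scope) -> context,
    synthesize(context) -> understanding, find_gaps(understanding, feedback)
    -> list of {"topic", "question", "material"} dicts.
    """
    scope = ask("What should this review cover?")
    context = read_sources(scope)            # read code + requirements before asking anything else
    understanding = synthesize(context)      # model surfaces what it learned
    feedback = ask(f"Here is my understanding: {understanding}. Any corrections?")
    answers = {}
    for gap in find_gaps(understanding, feedback):
        if gap["material"]:                  # materiality filter: skip questions that
            answers[gap["topic"]] = ask(gap["question"])  # wouldn't change the analysis
    return understanding, answers
```

The scripted Domain/Teams/Pain-points questions disappear as fixed steps; they only resurface as `find_gaps` output when the code and requirements left them genuinely open.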
The model now:
High-level-design skill — Step 1.2 enhanced
Before: "List every ambiguity. Ask about each one." (generic, not coupling-aware)
After: Coupling-aware gap discovery — the model thinks about what the Balanced Coupling model needs (domain classification → volatility, organizational structure → distance, integration patterns → strength) and asks about gaps in those specific areas. Same materiality test: don't ask questions that wouldn't change the design.
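A minimal sketch of that mapping, assuming a simple "evidence found / not found" flag per dimension (the dict contents paraphrase the PR description; `gap_questions` and its phrasing are hypothetical):

```python
# Which Balanced Coupling input each dimension needs, per the mapping above.
COUPLING_INPUTS = {
    "volatility": "domain classification",
    "distance": "organizational structure",
    "strength": "integration patterns",
}

def gap_questions(known):
    """Return targeted questions only for dimensions the code and
    requirements did not already reveal (known[dim] == True means
    evidence was found, so no question is needed)."""
    return [
        f"To assess {dim}, I still need the {need}. Can you clarify?"
        for dim, need in COUPLING_INPUTS.items()
        if not known.get(dim, False)
    ]

# If the code already reveals the integration patterns, only the
# volatility and distance questions survive the filter.
questions = gap_questions({"strength": True})
```

The materiality test is the `if not known.get(...)` guard: a dimension the sources already answer never produces a question, which is exactly the "don't ask questions that wouldn't change the design" rule.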
The reasoning behind it
I did some research into how Opus 4.6 actually works (read the system card). A few things stood out:
Calibration: Opus 4.6 has state-of-the-art calibration — it knows when it doesn't know things (96.8% on false premise rejection, highest net scores on factual honesty). When we ask it to report confidence levels, it genuinely can identify where it's uncertain. That's exactly what we want for gap discovery.
"General instructions over prescriptive steps": Anthropic's own SWE-bench testing showed that telling the model to "explore the codebase and understand the root cause" produced better results than hand-written step-by-step plans. Our fixed questionnaire was a step-by-step plan. The new approach gives intent and constraints, then gets out of the way.
Agentic search: Opus 4.6 is state-of-the-art at multi-step information seeking (84% BrowseComp, 91.3% DeepSearchQA). It's naturally good at the "read context → identify gaps → decide what to explore next" pattern. We were under-utilizing this.
What stays the same
Example of the difference
Before:
After:
Two targeted questions instead of four scripted ones, and better information, because the questions are grounded in actual code.