Improve Claude Code skill descriptions and naming consistency #35881
Merged
Renames adapter-guide, debug-ci, limits-test, parallel-workload, platform-checks, and query-tracing to use the mz- prefix, matching the convention already used by mz-benchmark, mz-commit, mz-profile, mz-run, mz-test, and mz-pr-review. Updates the name field in each SKILL.md frontmatter and fixes the trace_tree.py path reference in mz-query-tracing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This skill was missing the `name` frontmatter field entirely, which could
prevent Claude Code from correctly identifying it. The description was also
too generic ("when the user asks for a review") and missed common casual
triggers like "review my code", "check my diff", or "does this look ok".
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
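A missing `name` field like the one described above can be caught mechanically. The following is a minimal sketch, not part of this PR. It assumes each skill lives at `.claude/skills/<skill>/SKILL.md`, that the frontmatter consists of simple single-line `key: value` pairs between `---` fences, and that the `name` field is expected to equal the directory name. The helper names `read_frontmatter` and `check_skills` are hypothetical.

```python
from pathlib import Path


def read_frontmatter(skill_md: Path) -> dict[str, str]:
    """Parse single-line `key: value` pairs between the leading `---` fences.

    Simplified on purpose: no nested YAML, no multi-line values.
    Returns an empty dict if the file has no complete frontmatter block.
    """
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        # Skip blank lines and indented continuation lines.
        if sep and key.strip() and not line[0].isspace():
            fields[key.strip()] = value.strip().strip("\"'")
    return {}  # closing fence never found


def check_skills(skills_dir: Path, prefix: str = "mz-") -> list[str]:
    """Return one message per skill whose `name` is missing or inconsistent."""
    problems = []
    for skill_md in sorted(skills_dir.glob("*/SKILL.md")):
        folder = skill_md.parent.name
        name = read_frontmatter(skill_md).get("name")
        if name is None:
            problems.append(f"{folder}: missing name field")
        elif name != folder:  # assumed convention: name matches directory
            problems.append(f"{folder}: name is {name!r}, expected {folder!r}")
        elif not name.startswith(prefix):
            problems.append(f"{folder}: name lacks {prefix!r} prefix")
    return problems


if __name__ == "__main__":
    for problem in check_skills(Path(".claude/skills")):
        print(problem)
```

Run from the repository root, the script prints nothing when every skill passes. A real check would likely want a proper YAML parser, which this sketch avoids to stay within the standard library.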
The old one-line description only matched formal phrases like "Buildkite failures". Users more often say things like "why is CI red", "build broken", or "checks failing" — or just paste a Buildkite URL. The expanded description covers these patterns so the skill triggers when it should. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mz-test is the general testing guide, but mz-platform-checks, mz-parallel-workload, and mz-limits-test provide deeper guidance for their specific frameworks. Without cross-references, both the general and specific skill could trigger redundantly. The updated description clarifies mz-test as the starting point for framework selection and points to the dedicated skills for deep framework usage. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The old description only triggered on file paths and crate names, missing users who ask questions like "how does the coordinator work" or "what are read holds". Adding problem-oriented triggers helps the skill activate when someone is trying to understand the subsystem, not just editing it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The old description was focused on the mechanism (tracing, spans, Tempo) rather than the problem users are trying to solve. Adding triggers like "why is this query slow" and "where is the time going" helps the skill activate when users describe symptoms, not just when they already know they want tracing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mz-benchmark includes a "Parallel Benchmark" framework, and mz-parallel-workload is a separate stress-testing framework. The similar names could confuse the model. The updated description explicitly notes that mz-benchmark is about performance measurement, not the parallel-workload stress-testing framework. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The description now leads with what the framework does (stress-testing for panics/errors) and explicitly notes it is not for performance measurement, pointing to mz-benchmark for that. This disambiguates the two skills which have confusingly similar names. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrite the README to use the new mz- prefixed names, add a note that this file is human documentation only (not used for skill triggering), reorganize test framework skills into their own section with guidance to start from mz-test, and use clearer "When to use" descriptions throughout. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CLAUDE.md is always loaded into context, so a brief mention of the mz-* skills helps Claude check for relevant skills before starting a task. This is a lightweight nudge rather than a full listing — the skill descriptions themselves are already in context via SKILL.md frontmatter. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The description referenced platform-checks, parallel-workload, and limits-test without the mz- prefix. These need to match the actual skill names so the model can find them in its skill list. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-review Both mz-commit and mz-pr-review listed "code review" as a trigger phrase, but they serve different purposes: mz-commit is for creating commits and PRs, mz-pr-review is for reviewing changes. A user saying "code review" almost certainly wants the review skill, not the commit skill. Added a pointer to mz-pr-review to make the boundary explicit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Keep the skills nudge generic so it does not go stale when skills are added or renamed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
antiguru
approved these changes
Apr 6, 2026
Summary
Audit and improve the Claude Code skills in `.claude/skills/` to trigger more
reliably and avoid confusing overlaps.
Commit-by-commit breakdown
Rename skills to use consistent `mz-` prefix — `adapter-guide`, `debug-ci`,
`limits-test`, `parallel-workload`, `platform-checks`, and `query-tracing` are
renamed to match the convention already used by `mz-benchmark`, `mz-commit`,
`mz-profile`, `mz-run`, `mz-test`, and `mz-pr-review`. Updates `name` frontmatter
fields and the `trace_tree.py` path reference.
mz-pr-review: add missing `name` field and improve description — This skill
was missing the `name` frontmatter field entirely, which could prevent correct
identification. The description was too generic and missed casual triggers like
"review my code" or "does this look ok".
mz-debug-ci: expand description with casual trigger phrases — The old one-line
description only matched formal phrases. Users more often say "why is CI red" or
"checks failing".
mz-test: add cross-references to specialized framework skills — Clarifies
mz-test as the general testing guide and entry point for framework selection,
pointing to mz-platform-checks, mz-parallel-workload, and mz-limits-test for deep
framework-specific guidance.
mz-adapter-guide: add problem-oriented trigger phrases — The old description
only triggered on file paths and crate names. Adding triggers like "how does the
coordinator work" helps the skill activate when someone is asking questions, not
just editing files.
mz-query-tracing: add problem-oriented trigger phrases — Reframed from
mechanism-focused ("tracing, spans, Tempo") to problem-focused ("why is this query
slow", "where is the time going").
mz-benchmark: clarify distinction from mz-parallel-workload — Explicitly notes
this is about performance measurement frameworks, not the parallel-workload
stress-testing framework.
mz-parallel-workload: clarify distinction from mz-benchmark — Mirror
disambiguation: leads with "stress-testing for panics/errors" and points to
mz-benchmark for performance measurement.
Update skills README — Rewritten with clearer descriptions, new `mz-` names,
a dedicated "Specialized Test Frameworks" section, and a note that the README is
human documentation only (not used for skill triggering).
Add skills section to CLAUDE.md — A brief nudge reminding Claude that
project-specific `mz-*` skills exist and are worth consulting before starting a
task. This is lightweight by design: the skill descriptions are already in context
via SKILL.md frontmatter, so CLAUDE.md does not duplicate them.
mz-test: fix cross-references to use mz- prefixed skill names — The
cross-references said `platform-checks`, `parallel-workload`, and `limits-test`
without the `mz-` prefix, so they did not match the actual skill names.
mz-commit: remove "code review" trigger to avoid collision with mz-pr-review
— Both skills listed "code review" as a trigger phrase, but mz-commit is for
creating commits/PRs while mz-pr-review is for reviewing changes. Removed the
ambiguous trigger from mz-commit and added a pointer to mz-pr-review.
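A collision like the "code review" one between mz-commit and mz-pr-review could in principle be flagged automatically. The sketch below is one possible approach, not something this PR adds. It assumes trigger phrases appear as double-quoted strings inside each skill description, which is only a guess at the convention. The function names and the two sample descriptions are invented for illustration and are not the actual skill text.

```python
import re
from collections import defaultdict


def quoted_phrases(description: str) -> set[str]:
    """Return the lowercased double-quoted phrases found in a description."""
    return {m.strip().lower() for m in re.findall(r'"([^"]+)"', description)}


def find_collisions(descriptions: dict[str, str]) -> dict[str, list[str]]:
    """Map each phrase quoted by more than one skill to the skills quoting it."""
    owners: dict[str, set[str]] = defaultdict(set)
    for skill, text in descriptions.items():
        for phrase in quoted_phrases(text):
            owners[phrase].add(skill)
    return {
        phrase: sorted(skills)
        for phrase, skills in sorted(owners.items())
        if len(skills) > 1
    }


# Hypothetical descriptions, written only to exercise the check.
descriptions = {
    "mz-commit": 'Use for "commit", "open a PR", or "code review".',
    "mz-pr-review": 'Use for "code review", "review my code", or "check my diff".',
}

print(find_collisions(descriptions))
# {'code review': ['mz-commit', 'mz-pr-review']}
```

Exact string matching only catches verbatim overlaps. Near-duplicates such as "review code" versus "code review" would still need a human read of the descriptions.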
What was deliberately not done
No exhaustive skill listing in CLAUDE.md. Duplicating all 12 skill descriptions
would waste context tokens and create a maintenance burden. The brief mention is
sufficient to nudge the model to check the skill list it already has.
No AGENTS.md file. AGENTS.md is a cross-tool convention (Gemini CLI, Codex,
etc.) that Claude Code also reads. Since the team is using Claude Code and CLAUDE.md
is already established, adding AGENTS.md now would be premature. If other AI tools
are adopted later, a symlink `AGENTS.md -> CLAUDE.md` is the right move — the
skills section would be harmless noise for tools that do not support skills.
Test plan
`mz-*` prefix)
`/mz-query-tracing` works and references the correct `trace_tree.py` path

🤖 Generated with Claude Code