Skip to content

Improve Claude Code skill descriptions and naming consistency#35881

Merged
bosconi merged 13 commits intomainfrom
jc/skill-linting
Apr 6, 2026
Merged

Improve Claude Code skill descriptions and naming consistency#35881
bosconi merged 13 commits intomainfrom
jc/skill-linting

Conversation

@bosconi
Copy link
Copy Markdown
Member

@bosconi bosconi commented Apr 6, 2026

Summary

Audit and improve the Claude Code skills in .claude/skills/ to trigger more
reliably and avoid confusing overlaps.

Commit-by-commit breakdown

  1. Rename skills to use consistent mz- prefixadapter-guide, debug-ci,
    limits-test, parallel-workload, platform-checks, and query-tracing are
    renamed to match the convention already used by mz-benchmark, mz-commit,
    mz-profile, mz-run, mz-test, and mz-pr-review. Updates name frontmatter
    fields and the trace_tree.py path reference.

  2. mz-pr-review: add missing name field and improve description — This skill
    was missing the name frontmatter field entirely, which could prevent correct
    identification. The description was too generic and missed casual triggers like
    "review my code" or "does this look ok".

  3. mz-debug-ci: expand description with casual trigger phrases — The old one-line
    description only matched formal phrases. Users more often say "why is CI red" or
    "checks failing".

  4. mz-test: add cross-references to specialized framework skills — Clarifies
    mz-test as the general testing guide and entry point for framework selection,
    pointing to mz-platform-checks, mz-parallel-workload, and mz-limits-test for deep
    framework-specific guidance.

  5. mz-adapter-guide: add problem-oriented trigger phrases — The old description
    only triggered on file paths and crate names. Adding triggers like "how does the
    coordinator work" helps the skill activate when someone is asking questions, not
    just editing files.

  6. mz-query-tracing: add problem-oriented trigger phrases — Reframed from
    mechanism-focused ("tracing, spans, Tempo") to problem-focused ("why is this query
    slow", "where is the time going").

  7. mz-benchmark: clarify distinction from mz-parallel-workload — Explicitly notes
    this is about performance measurement frameworks, not the parallel-workload
    stress-testing framework.

  8. mz-parallel-workload: clarify distinction from mz-benchmark — Mirror
    disambiguation: leads with "stress-testing for panics/errors" and points to
    mz-benchmark for performance measurement.

  9. Update skills README — Rewritten with clearer descriptions, new mz- names,
    a dedicated "Specialized Test Frameworks" section, and a note that the README is
    human documentation only (not used for skill triggering).

  10. Add skills section to CLAUDE.md — A brief nudge reminding Claude that
    project-specific mz-* skills exist and are worth consulting before starting a
    task. This is lightweight by design: the skill descriptions are already in context
    via SKILL.md frontmatter, so CLAUDE.md does not duplicate them.

  11. mz-test: fix cross-references to use mz- prefixed skill names — The
    cross-references said platform-checks, parallel-workload, and limits-test
    without the mz- prefix, so they did not match the actual skill names.

  12. mz-commit: remove "code review" trigger to avoid collision with mz-pr-review
    — Both skills listed "code review" as a trigger phrase, but mz-commit is for
    creating commits/PRs while mz-pr-review is for reviewing changes. Removed the
    ambiguous trigger from mz-commit and added a pointer to mz-pr-review.

What was deliberately not done

  • No exhaustive skill listing in CLAUDE.md. Duplicating all 12 skill descriptions
    would waste context tokens and create a maintenance burden. The brief mention is
    sufficient to nudge the model to check the skill list it already has.

  • No AGENTS.md file. AGENTS.md is a cross-tool convention (Gemini CLI, Codex,
    etc.) that Claude Code also reads. Since the team is using Claude Code and CLAUDE.md
    is already established, adding AGENTS.md now would be premature. If other AI tools
    are adopted later, a symlink AGENTS.md -> CLAUDE.md is the right move — the
    skills section would be harmless noise for tools that do not support skills.

Test plan

  • Verify all 12 skills appear in the skill list at session start (all mz-* prefix)
  • Spot-check triggering: "why is CI red" should suggest mz-debug-ci
  • Spot-check triggering: "review my changes" should suggest mz-pr-review
  • Spot-check triggering: "code review" should suggest mz-pr-review, not mz-commit
  • Verify /mz-query-tracing works and references the correct trace_tree.py path

🤖 Generated with Claude Code

bosconi and others added 10 commits April 6, 2026 10:25
Renames adapter-guide, debug-ci, limits-test, parallel-workload,
platform-checks, and query-tracing to use the mz- prefix, matching the
convention already used by mz-benchmark, mz-commit, mz-profile, mz-run,
mz-test, and mz-pr-review. Updates the name field in each SKILL.md
frontmatter and fixes the trace_tree.py path reference in mz-query-tracing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This skill was missing the `name` frontmatter field entirely, which could
prevent Claude Code from correctly identifying it. The description was also
too generic ("when the user asks for a review") and missed common casual
triggers like "review my code", "check my diff", or "does this look ok".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The old one-line description only matched formal phrases like "Buildkite
failures". Users more often say things like "why is CI red", "build broken",
or "checks failing" — or just paste a Buildkite URL. The expanded description
covers these patterns so the skill triggers when it should.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mz-test is the general testing guide, but mz-platform-checks,
mz-parallel-workload, and mz-limits-test provide deeper guidance for their
specific frameworks. Without cross-references, both the general and specific
skill could trigger redundantly. The updated description clarifies mz-test
as the starting point for framework selection and points to the dedicated
skills for deep framework usage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The old description only triggered on file paths and crate names, missing
users who ask questions like "how does the coordinator work" or "what are
read holds". Adding problem-oriented triggers helps the skill activate when
someone is trying to understand the subsystem, not just editing it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The old description was focused on the mechanism (tracing, spans, Tempo)
rather than the problem users are trying to solve. Adding triggers like
"why is this query slow" and "where is the time going" helps the skill
activate when users describe symptoms, not just when they already know they
want tracing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mz-benchmark includes a "Parallel Benchmark" framework, and
mz-parallel-workload is a separate stress-testing framework. The similar
names could confuse the model. The updated description explicitly notes
that mz-benchmark is about performance measurement, not the
parallel-workload stress-testing framework.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The description now leads with what the framework does (stress-testing for
panics/errors) and explicitly notes it is not for performance measurement,
pointing to mz-benchmark for that. This disambiguates the two skills which
have confusingly similar names.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrite the README to use the new mz- prefixed names, add a note that this
file is human documentation only (not used for skill triggering), reorganize
test framework skills into their own section with guidance to start from
mz-test, and use clearer "When to use" descriptions throughout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CLAUDE.md is always loaded into context, so a brief mention of the mz-*
skills helps Claude check for relevant skills before starting a task. This
is a lightweight nudge rather than a full listing — the skill descriptions
themselves are already in context via SKILL.md frontmatter.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 6, 2026

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

The description referenced platform-checks, parallel-workload, and
limits-test without the mz- prefix. These need to match the actual skill
names so the model can find them in its skill list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bosconi and others added 2 commits April 6, 2026 11:30
…-review

Both mz-commit and mz-pr-review listed "code review" as a trigger phrase,
but they serve different purposes: mz-commit is for creating commits and PRs,
mz-pr-review is for reviewing changes. A user saying "code review" almost
certainly wants the review skill, not the commit skill. Added a pointer to
mz-pr-review to make the boundary explicit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Keep the skills nudge generic so it does not go stale when skills are
added or renamed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@bosconi bosconi merged commit 95e620a into main Apr 6, 2026
6 checks passed
@bosconi bosconi deleted the jc/skill-linting branch April 6, 2026 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants