MaterializeInc · bosconi · Apr 6, 2026
diff --git a/.claude/skills/README.md b/.claude/skills/README.md
@@ -1,37 +1,44 @@
 # Claude Code Skills for Materialize
 
 This directory contains skills that teach Claude Code how to work effectively in
-the Materialize codebase. Skills are triggered automatically based on context, or
-can be invoked explicitly with `/<skill-name>`.
+the Materialize codebase. Skills are triggered automatically based on their
+SKILL.md description, or can be invoked explicitly with `/<skill-name>`.
+
+> **Note:** This README is for humans browsing the repo. Claude Code does not
+> read this file when deciding which skills to trigger -- it uses the `name` and
+> `description` fields in each skill's SKILL.md frontmatter.
 
 ## Development Workflow
 
-| Skill | Trigger | What it does |
+| Skill | When to use | What it does |
 |---|---|---|
-| **mz-run** | Compiling, running, formatting, linting | How to build and run Materialize locally, including `cargo`, `bin/fmt`, `bin/lint`, log filters, and jemalloc setup |
-| **mz-test** | Running or writing tests | Guides for every test framework: unit tests, sqllogictest, testdrive, pgtest, and mzcompose |
-| **mz-commit** | Committing, creating PRs | Pre-commit checklist, PR title conventions, and git workflows |
-| **mz-pr-review** | Reviewing code | Local code review against Materialize standards for tests, style, architecture, and polish |
-| **debug-ci** | CI failures on a PR | Investigates Buildkite failures using `gh` and `bk` CLI tools to identify root causes |
+| **mz-run** | Compiling, running locally, formatting, linting | Build and run Materialize locally (`cargo check`, `bin/environmentd`, `bin/fmt`, `bin/lint`, log filters, jemalloc) |
+| **mz-test** | Running or writing tests, choosing a test framework | General testing guide: unit tests, sqllogictest, testdrive, pgtest, mzcompose, reproducing bugs. Points to dedicated skills for specific frameworks |
+| **mz-commit** | Committing, creating PRs, pushing | Pre-commit checklist, PR title conventions, Cargo.lock discipline, git workflows |
+| **mz-pr-review** | Reviewing code changes | Local code review against Materialize standards for tests, style, error messages, architecture, and polish |
+| **mz-debug-ci** | CI failures, red builds, Buildkite issues | Investigates Buildkite failures using `gh` and `bk` CLI tools to identify root causes |
 
 ## Performance
 
-| Skill | Trigger | What it does |
+| Skill | When to use | What it does |
 |---|---|---|
-| **mz-benchmark** | Adding or debugging benchmarks | Three frameworks: Feature Benchmark (micro-benchmarks), Scalability Test (throughput under concurrency), and Parallel Benchmark (sustained performance) |
-| **mz-profile** | Profiling, slow queries, high memory | CPU profiling with samply, memory profiling with heaptrack, and binary size analysis |
-| **query-tracing** | Tracing query execution time | Distributed tracing with OpenTelemetry/Tempo to understand where time goes in SQL statement execution |
+| **mz-benchmark** | Adding or debugging benchmark scenarios | Three measurement frameworks: Feature Benchmark (micro-benchmarks), Scalability Test (throughput under concurrency), Parallel Benchmark (sustained latency) |
+| **mz-profile** | Something is slow or using too much memory | CPU profiling with samply, memory profiling with heaptrack, binary size analysis |
+| **mz-query-tracing** | Understanding where time goes in SQL execution | Distributed tracing with OpenTelemetry/Tempo to get a latency breakdown of any SQL statement |
+
+## Specialized Test Frameworks
 
-## Test Frameworks
+These skills provide deep guidance for specific test frameworks. For help
+choosing which framework to use, start with **mz-test**.
 
-| Skill | Trigger | What it does |
+| Skill | When to use | What it does |
 |---|---|---|
-| **platform-checks** | Writing upgrade/restart checks | "Write once, run everywhere" framework for testing feature survival across restarts and upgrades |
-| **parallel-workload** | Concurrent stress testing | Runs random SQL actions concurrently to catch panics and unexpected errors |
-| **limits-test** | Stress-testing with many objects | Catches regressions (panics, stack overflows, OOMs) when scaling to large numbers of tables, views, sources, etc. |
+| **mz-platform-checks** | Testing feature survival across restarts/upgrades | Write Check classes with initialize/manipulate/validate phases that run across upgrade and restart scenarios |
+| **mz-parallel-workload** | Testing for panics under concurrent SQL | Extend the stress-testing framework that runs random SQL actions concurrently to catch panics and unexpected errors |
+| **mz-limits-test** | Stress-testing with large numbers of objects | Add Generator subclasses that scale tables, views, sources, etc. to catch panics, stack overflows, and OOMs |
 
 ## Architecture
 
-| Skill | Trigger | What it does |
+| Skill | When to use | What it does |
 |---|---|---|
-| **adapter-guide** | Working on adapter, coordinator, pgwire | Correctness invariants and architectural notes for the adapter layer, timestamp oracle, peek paths, and related crates |
+| **mz-adapter-guide** | Working on or asking about the adapter layer | Correctness invariants and architectural notes for the coordinator, pgwire, peek paths, timestamp oracle, and related crates |
diff --git a/.claude/skills/adapter-guide/SKILL.md → .claude/skills/mz-adapter-guide/SKILL.md b/.claude/skills/adapter-guide/SKILL.md → .claude/skills/mz-adapter-guide/SKILL.md
@@ -1,12 +1,12 @@
 ---
-name: adapter-guide
+name: mz-adapter-guide
 description: >
-  This skill should be used when the user works on the adapter layer,
-  coordinator, pgwire, frontend peek path, peek client, timestamp oracle,
-  batching oracle, or related crates. Trigger when the user mentions or edits
-  files in src/adapter/, src/pgwire/, src/timestamp-oracle/, or mentions
-  frontend, coordinator, read holds, timestamp selection, timestamp oracle,
-  batching oracle, or query sequencing.
+  Correctness invariants and architectural guidance for the adapter layer,
+  coordinator, pgwire, peek paths, and timestamp oracle. Trigger when the user
+  works on or asks questions about these subsystems — including "how does the
+  coordinator work", "what are read holds", "explain the peek path", "how does
+  timestamp selection work", "why does this query block". Also trigger when
+  editing files in src/adapter/, src/pgwire/, or src/timestamp-oracle/.
 ---
 
 # Adapter Guide Skill

diff --git a/.claude/skills/mz-benchmark/SKILL.md b/.claude/skills/mz-benchmark/SKILL.md
@@ -1,14 +1,16 @@
 ---
 name: mz-benchmark
 description: >
-  This skill should be used when the user wants to add, modify, or debug a
-  benchmark scenario for performance testing. Trigger when the user mentions
+  Add, modify, or debug benchmark scenarios for measuring Materialize
+  performance. Covers three frameworks: Feature Benchmark (single-operation
+  micro-benchmarks), Scalability Test (SQL throughput under concurrency), and
+  Parallel Benchmark (sustained latency over time via scenarios.py). Trigger on
   "benchmark", "feature benchmark", "scalability test", "parallel benchmark",
-  "performance regression", "micro-benchmark", "TPS", "latency test", or wants
-  to measure how fast a SQL operation runs. Also trigger when the user edits
-  files in misc/python/materialize/feature_benchmark/scenarios/,
-  misc/python/materialize/scalability/workload/workloads/, or
-  misc/python/materialize/parallel_benchmark/scenarios.py.
+  "performance regression", "micro-benchmark", "TPS", "latency test", or when
+  editing files in feature_benchmark/scenarios/, scalability/workload/workloads/,
+  or parallel_benchmark/scenarios.py. Note: this is about benchmark measurement
+  frameworks, not the parallel-workload stress-testing framework (which tests for
+  panics under concurrency, not performance).
 ---
 
 # Benchmark Frameworks

diff --git a/.claude/skills/mz-commit/SKILL.md b/.claude/skills/mz-commit/SKILL.md
@@ -3,9 +3,9 @@ name: mz-commit
 description: >
   This skill should be used when the user wants to "commit", "prepare a commit",
   "create a PR", "push", "open a pull request", or mentions committing,
-  pre-commit checks, pull requests, or code review in the Materialize
-  repository. Use this skill even if the user just says "ship it" or
-  "ready to merge" without being specific.
+  pre-commit checks, or pull requests in the Materialize repository. Use this
+  skill even if the user just says "ship it" or "ready to merge" without being
+  specific. Note: for reviewing code, use mz-pr-review instead.
 ---
 
 # Committing in Materialize

diff --git a/.claude/skills/debug-ci/SKILL.md → .claude/skills/mz-debug-ci/SKILL.md b/.claude/skills/debug-ci/SKILL.md → .claude/skills/mz-debug-ci/SKILL.md
@@ -1,6 +1,12 @@
 ---
-name: debug-ci
-description: Investigate CI failures on a PR using gh and bk CLI tools. Triggers when asked about failing checks, Buildkite failures, or CI issues on a PR.
+name: mz-debug-ci
+description: >
+  Investigate CI failures on a PR using gh and bk CLI tools. Trigger when the
+  user asks about failing checks, Buildkite failures, or CI issues — including
+  casual phrases like "why is CI red", "build broken", "checks failing", "what
+  went wrong in CI", "nightly broke", "tests failing on this PR", or pastes a
+  Buildkite URL. Also trigger when the user mentions a specific PR number and
+  wants to understand why it's failing.
 argument-hint: <PR number or GitHub PR URL>
 ---
 

diff --git a/.claude/skills/limits-test/SKILL.md → .claude/skills/mz-limits-test/SKILL.md b/.claude/skills/limits-test/SKILL.md → .claude/skills/mz-limits-test/SKILL.md
@@ -1,5 +1,5 @@
 ---
-name: limits-test
+name: mz-limits-test
 description: >
   This skill should be used when the user wants to add, modify, or debug a limits
   test. Trigger when the user mentions "limits test", "Generator subclass",

diff --git a/.claude/skills/parallel-workload/SKILL.md → .claude/skills/mz-parallel-workload/SKILL.md b/.claude/skills/parallel-workload/SKILL.md → .claude/skills/mz-parallel-workload/SKILL.md
@@ -1,13 +1,13 @@
 ---
-name: parallel-workload
+name: mz-parallel-workload
 description: >
-  This skill should be used when the user wants to extend the parallel-workload
-  test framework, add a new action, modify existing actions, or add SQL coverage
-  to parallel workload. Trigger when the user mentions "parallel workload",
-  "parallel-workload", "action.py" in the context of parallel workload,
-  or wants to test for panics or unexpected query errors under concurrency.
-  Use this skill even if the user just says "add this to parallel workload" or
-  references a bug that panics under concurrent DDL/DML.
+  Extend the parallel-workload stress-testing framework, which runs random SQL
+  actions concurrently to catch panics and unexpected errors (not performance
+  measurement — see mz-benchmark for that). Trigger when the user mentions
+  "parallel workload", "parallel-workload", "action.py" in the context of
+  parallel workload, or wants to test for panics or unexpected query errors under
+  concurrency. Use this skill even if the user just says "add this to parallel
+  workload" or references a bug that panics under concurrent DDL/DML.
 ---
 
 # Extending Parallel Workload

diff --git a/.claude/skills/platform-checks/SKILL.md → .claude/skills/mz-platform-checks/SKILL.md b/.claude/skills/platform-checks/SKILL.md → .claude/skills/mz-platform-checks/SKILL.md
@@ -1,5 +1,5 @@
 ---
-name: platform-checks
+name: mz-platform-checks
 description: >
   This skill should be used when the user wants to create, modify, or debug a
   platform check. Trigger when the user mentions "platform check", "platform-checks",

diff --git a/.claude/skills/mz-pr-review/SKILL.md b/.claude/skills/mz-pr-review/SKILL.md
@@ -1,7 +1,12 @@
 ---
-description:  >
-Local PR review against Materialize standards.
-This skill should be used when the user asks for a review of changes.
+name: mz-pr-review
+description: >
+  Perform a local code review of the current branch's changes against Materialize
+  project standards. Trigger when the user says "review my code", "review my
+  changes", "check my diff", "does this look ok", "what do you think of this
+  PR", "code review", or asks you to look over changes before merging. Also
+  trigger when the user passes a PR number and wants feedback on quality, style,
+  or correctness.
 argument-hint: [base-branch]
 allowed-tools: [Bash, Read, Grep, Glob, Task]
 ---
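
The hunk above adds the missing `name` field and indents the folded `description` in the mz-pr-review frontmatter. A check along these lines could catch that kind of drift across all skills. This is a minimal sketch, not part of the PR: the helper names `parse_frontmatter` and `check_skill` are hypothetical, the parser handles only flat `key: value` pairs and folded scalars rather than full YAML, and the rules that `name` equals the directory name and carries the `mz-` prefix are conventions inferred from this diff.

```python
from __future__ import annotations

from pathlib import Path


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse the leading `---` block of a SKILL.md into a flat dict.

    Handles plain `key: value` pairs and folded scalars (`key: >`) whose
    continuation lines are indented. This is not a full YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    key = None
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if not line.strip():
            continue
        if line[:1] in (" ", "\t"):
            # Indented line: continuation of the current folded scalar.
            if key is not None:
                fields[key] = (fields[key] + " " + line.strip()).strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            # Unindented text without a colon is not a field; ignore it.
            key = None
            continue
        key = name.strip()
        value = value.strip()
        fields[key] = "" if value in (">", ">-", "|", "|-") else value
    return fields


def check_skill(dir_name: str, skill_md: str, prefix: str = "mz-") -> list[str]:
    """Return a list of problems found in one skill's frontmatter."""
    fields = parse_frontmatter(skill_md)
    problems = []
    name = fields.get("name", "")
    if not name:
        problems.append("missing name")
    elif name != dir_name:
        problems.append(f"name {name!r} does not match directory {dir_name!r}")
    if not dir_name.startswith(prefix):
        problems.append(f"directory {dir_name!r} lacks the {prefix!r} prefix")
    if not fields.get("description"):
        problems.append("missing or empty description")
    return problems


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for skill_md in sorted(Path(".claude/skills").glob("*/SKILL.md")):
        for problem in check_skill(skill_md.parent.name, skill_md.read_text()):
            print(f"{skill_md}: {problem}")
```

Run against the old mz-pr-review frontmatter, this sketch would report both a missing name and an empty description, because the unindented description lines are not treated as part of the folded scalar.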

diff --git a/.claude/skills/query-tracing/SKILL.md → .claude/skills/mz-query-tracing/SKILL.md b/.claude/skills/query-tracing/SKILL.md → .claude/skills/mz-query-tracing/SKILL.md
@@ -1,11 +1,11 @@
 ---
-name: query-tracing
+name: mz-query-tracing
 description: >
-  This skill should be used when the user wants to debug or trace where time is
-  spent during query execution in Materialize. Trigger when the user mentions
-  tracing queries, span analysis, query latency breakdown, Tempo traces,
-  trace IDs, or wants to understand where time goes in CREATE TABLE, SELECT,
-  INSERT, DROP, or any other SQL statement execution.
+  Debug where time is spent during SQL execution using distributed tracing
+  (OpenTelemetry / Tempo). Trigger when the user asks "why is this query slow",
+  "where is the time going", "this SELECT takes forever", or wants a latency
+  breakdown for any SQL statement. Also trigger on mentions of tracing queries,
+  span analysis, Tempo traces, trace IDs, or opentelemetry_filter.
 ---
 
 # Query Tracing Skill
@@ -164,7 +164,7 @@ parent span IDs are base64-encoded, not hex. The analysis script handles this.
 Use the `trace_tree.py` script in this skill directory to analyze traces:
 
 ```bash
-python3 .claude/skills/query-tracing/trace_tree.py /tmp/claude-1000/trace.json "My Query"
+python3 .claude/skills/mz-query-tracing/trace_tree.py /tmp/claude-1000/trace.json "My Query"
 ```
 
 ## Interpreting Results

diff --git a/.claude/skills/query-tracing/trace_tree.py → ...ude/skills/mz-query-tracing/trace_tree.py b/.claude/skills/query-tracing/trace_tree.py → ...ude/skills/mz-query-tracing/trace_tree.py
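
The hunk context for the query-tracing skill notes that parent span IDs are base64-encoded, not hex, and that the analysis script handles this. How `trace_tree.py` does the conversion is not shown in this diff. A minimal sketch of the idea, using a hypothetical helper name, might look like:

```python
import base64


def span_id_to_hex(b64_id: str) -> str:
    """Convert a base64-encoded span or trace ID to lowercase hex."""
    return base64.b64decode(b64_id).hex()


# An 8-byte span ID with value 1, base64-encoded.
print(span_id_to_hex("AAAAAAAAAAE="))  # 0000000000000001
```

Normalizing IDs to hex this way makes them comparable with the hex IDs that appear in logs and tracing UIs.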
diff --git a/.claude/skills/mz-test/SKILL.md b/.claude/skills/mz-test/SKILL.md
@@ -1,13 +1,16 @@
 ---
 name: mz-test
 description: >
-  This skill should be used when the user wants to "run tests", "run testdrive",
+  General guide for running tests and choosing the right test framework in
+  Materialize. Trigger when the user wants to "run tests", "run testdrive",
   "run sqllogictest", "run mzcompose", "run cargo test", "run pgtest",
   "rewrite test results", "add a test", "reproduce a bug", "write a regression
   test", or mentions testing, testdrive, sqllogictest, mzcompose, pgtest,
-  cargo test, nextest, flaky tests, or test failures in the Materialize
-  repository. Use this skill even if the user just says "test this" or
-  "how do I verify this works" without naming a specific framework.
+  cargo test, nextest, flaky tests, or test failures. Use this skill even if the
+  user just says "test this" or "how do I verify this works" without naming a
+  specific framework. For deep guidance on specific frameworks, see the dedicated
+  skills: mz-platform-checks (upgrade/restart survival), mz-parallel-workload
+  (concurrent stress testing), and mz-limits-test (scaling to many objects).
 ---
 
 # Testing Materialize

diff --git a/CLAUDE.md b/CLAUDE.md
@@ -1,5 +1,11 @@
 # Materialize
 
+## Skills
+
+This repo has Materialize-specific skills in `.claude/skills/`. Before starting
+a task, check if a relevant `mz-*` skill exists — they encode project-specific
+conventions and save significant time.
+
 ## Code navigation
 
 When tracing how an operation flows through the codebase, read these files first: