45 changes: 26 additions & 19 deletions .claude/skills/README.md
Original file line number Diff line number Diff line change
@@ -1,37 +1,44 @@
# Claude Code Skills for Materialize

This directory contains skills that teach Claude Code how to work effectively in
the Materialize codebase. Skills are triggered automatically based on context, or
can be invoked explicitly with `/<skill-name>`.
the Materialize codebase. Skills are triggered automatically based on their
SKILL.md description, or can be invoked explicitly with `/<skill-name>`.

> **Note:** This README is for humans browsing the repo. Claude Code does not
> read this file when deciding which skills to trigger -- it uses the `name` and
> `description` fields in each skill's SKILL.md frontmatter.

## Development Workflow

| Skill | Trigger | What it does |
| Skill | When to use | What it does |
|---|---|---|
| **mz-run** | Compiling, running, formatting, linting | How to build and run Materialize locally, including `cargo`, `bin/fmt`, `bin/lint`, log filters, and jemalloc setup |
| **mz-test** | Running or writing tests | Guides for every test framework: unit tests, sqllogictest, testdrive, pgtest, and mzcompose |
| **mz-commit** | Committing, creating PRs | Pre-commit checklist, PR title conventions, and git workflows |
| **mz-pr-review** | Reviewing code | Local code review against Materialize standards for tests, style, architecture, and polish |
| **debug-ci** | CI failures on a PR | Investigates Buildkite failures using `gh` and `bk` CLI tools to identify root causes |
| **mz-run** | Compiling, running locally, formatting, linting | Build and run Materialize locally (`cargo check`, `bin/environmentd`, `bin/fmt`, `bin/lint`, log filters, jemalloc) |
| **mz-test** | Running or writing tests, choosing a test framework | General testing guide: unit tests, sqllogictest, testdrive, pgtest, mzcompose, reproducing bugs. Points to dedicated skills for specific frameworks |
| **mz-commit** | Committing, creating PRs, pushing | Pre-commit checklist, PR title conventions, Cargo.lock discipline, git workflows |
| **mz-pr-review** | Reviewing code changes | Local code review against Materialize standards for tests, style, error messages, architecture, and polish |
| **mz-debug-ci** | CI failures, red builds, Buildkite issues | Investigates Buildkite failures using `gh` and `bk` CLI tools to identify root causes |

## Performance

| Skill | Trigger | What it does |
| Skill | When to use | What it does |
|---|---|---|
| **mz-benchmark** | Adding or debugging benchmarks | Three frameworks: Feature Benchmark (micro-benchmarks), Scalability Test (throughput under concurrency), and Parallel Benchmark (sustained performance) |
| **mz-profile** | Profiling, slow queries, high memory | CPU profiling with samply, memory profiling with heaptrack, and binary size analysis |
| **query-tracing** | Tracing query execution time | Distributed tracing with OpenTelemetry/Tempo to understand where time goes in SQL statement execution |
| **mz-benchmark** | Adding or debugging benchmark scenarios | Three measurement frameworks: Feature Benchmark (micro-benchmarks), Scalability Test (throughput under concurrency), Parallel Benchmark (sustained latency) |
| **mz-profile** | Something is slow or using too much memory | CPU profiling with samply, memory profiling with heaptrack, binary size analysis |
| **mz-query-tracing** | Understanding where time goes in SQL execution | Distributed tracing with OpenTelemetry/Tempo to get a latency breakdown of any SQL statement |

## Specialized Test Frameworks

## Test Frameworks
These skills provide deep guidance for specific test frameworks. For help
choosing which framework to use, start with **mz-test**.

| Skill | Trigger | What it does |
| Skill | When to use | What it does |
|---|---|---|
| **platform-checks** | Writing upgrade/restart checks | "Write once, run everywhere" framework for testing feature survival across restarts and upgrades |
| **parallel-workload** | Concurrent stress testing | Runs random SQL actions concurrently to catch panics and unexpected errors |
| **limits-test** | Stress-testing with many objects | Catches regressions (panics, stack overflows, OOMs) when scaling to large numbers of tables, views, sources, etc. |
| **mz-platform-checks** | Testing feature survival across restarts/upgrades | Write Check classes with initialize/manipulate/validate phases that run across upgrade and restart scenarios |
| **mz-parallel-workload** | Testing for panics under concurrent SQL | Extend the stress-testing framework that runs random SQL actions concurrently to catch panics and unexpected errors |
| **mz-limits-test** | Stress-testing with large numbers of objects | Add Generator subclasses that scale tables, views, sources, etc. to catch panics, stack overflows, and OOMs |

## Architecture

| Skill | Trigger | What it does |
| Skill | When to use | What it does |
|---|---|---|
| **adapter-guide** | Working on adapter, coordinator, pgwire | Correctness invariants and architectural notes for the adapter layer, timestamp oracle, peek paths, and related crates |
| **mz-adapter-guide** | Working on or asking about the adapter layer | Correctness invariants and architectural notes for the coordinator, pgwire, peek paths, timestamp oracle, and related crates |
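
The README note above says skill selection relies on the `name` and `description` fields in each SKILL.md frontmatter, and this change renames skills to a shared `mz-` prefix. A minimal sketch of a consistency check follows. It assumes each skill lives in a directory named after its `name` field and that the `mz-` prefix is required; both are assumptions for illustration, and `parse_frontmatter` and `check_skill` are hypothetical helpers, not part of the repository. The parser handles only flat `key: value` pairs and folded `key: >` scalars, not general YAML.

```python
from __future__ import annotations

import re


def parse_frontmatter(text: str) -> dict[str, str]:
    """Extract top-level scalar fields from a SKILL.md frontmatter block.

    Handles `key: value` pairs and folded `key: >` scalars whose
    continuation lines are indented. This is not a general YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    key = None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        match = re.match(r"^([A-Za-z][\w-]*):\s*(.*)$", line)
        if match:
            key, value = match.group(1), match.group(2).strip()
            # A bare block-scalar indicator means the value follows on later lines.
            fields[key] = "" if value in (">", "|", ">-", "|-") else value
        elif key is not None and line.strip():
            # Indented continuation line: fold it into the current value.
            fields[key] = (fields[key] + " " + line.strip()).strip()
    return fields


def check_skill(dir_name: str, text: str, prefix: str = "mz-") -> list[str]:
    """Return a list of problems found in one skill's frontmatter."""
    fields = parse_frontmatter(text)
    problems = []
    name = fields.get("name", "")
    if not name:
        problems.append("missing name")
    else:
        if name != dir_name:  # assumed convention: directory matches name
            problems.append(f"name {name!r} does not match directory {dir_name!r}")
        if not name.startswith(prefix):  # assumed convention: shared prefix
            problems.append(f"name {name!r} lacks prefix {prefix!r}")
    if not fields.get("description"):
        problems.append("missing description")
    return problems


SAMPLE = """---
name: mz-debug-ci
description: >
  Investigate CI failures on a PR using gh and bk CLI tools.
argument-hint: <PR number or GitHub PR URL>
---

# body
"""

if __name__ == "__main__":
    print(check_skill("mz-debug-ci", SAMPLE))
```

Run against a frontmatter block with no `name` field, as `mz-pr-review` had before this change, the check reports `missing name`.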
@@ -1,12 +1,12 @@
---
name: adapter-guide
name: mz-adapter-guide
description: >
This skill should be used when the user works on the adapter layer,
coordinator, pgwire, frontend peek path, peek client, timestamp oracle,
batching oracle, or related crates. Trigger when the user mentions or edits
files in src/adapter/, src/pgwire/, src/timestamp-oracle/, or mentions
frontend, coordinator, read holds, timestamp selection, timestamp oracle,
batching oracle, or query sequencing.
Correctness invariants and architectural guidance for the adapter layer,
coordinator, pgwire, peek paths, and timestamp oracle. Trigger when the user
works on or asks questions about these subsystems — including "how does the
coordinator work", "what are read holds", "explain the peek path", "how does
timestamp selection work", "why does this query block". Also trigger when
editing files in src/adapter/, src/pgwire/, or src/timestamp-oracle/.
---

# Adapter Guide Skill
16 changes: 9 additions & 7 deletions .claude/skills/mz-benchmark/SKILL.md
@@ -1,14 +1,16 @@
---
name: mz-benchmark
description: >
This skill should be used when the user wants to add, modify, or debug a
benchmark scenario for performance testing. Trigger when the user mentions
Add, modify, or debug benchmark scenarios for measuring Materialize
performance. Covers three frameworks: Feature Benchmark (single-operation
micro-benchmarks), Scalability Test (SQL throughput under concurrency), and
Parallel Benchmark (sustained latency over time via scenarios.py). Trigger on
"benchmark", "feature benchmark", "scalability test", "parallel benchmark",
"performance regression", "micro-benchmark", "TPS", "latency test", or wants
to measure how fast a SQL operation runs. Also trigger when the user edits
files in misc/python/materialize/feature_benchmark/scenarios/,
misc/python/materialize/scalability/workload/workloads/, or
misc/python/materialize/parallel_benchmark/scenarios.py.
"performance regression", "micro-benchmark", "TPS", "latency test", or when
editing files in feature_benchmark/scenarios/, scalability/workload/workloads/,
or parallel_benchmark/scenarios.py. Note: this is about benchmark measurement
frameworks, not the parallel-workload stress-testing framework (which tests for
panics under concurrency, not performance).
---

# Benchmark Frameworks
6 changes: 3 additions & 3 deletions .claude/skills/mz-commit/SKILL.md
@@ -3,9 +3,9 @@ name: mz-commit
description: >
This skill should be used when the user wants to "commit", "prepare a commit",
"create a PR", "push", "open a pull request", or mentions committing,
pre-commit checks, pull requests, or code review in the Materialize
repository. Use this skill even if the user just says "ship it" or
"ready to merge" without being specific.
pre-commit checks, or pull requests in the Materialize repository. Use this
skill even if the user just says "ship it" or "ready to merge" without being
specific. Note: for reviewing code, use mz-pr-review instead.
---

# Committing in Materialize
@@ -1,6 +1,12 @@
---
name: debug-ci
description: Investigate CI failures on a PR using gh and bk CLI tools. Triggers when asked about failing checks, Buildkite failures, or CI issues on a PR.
name: mz-debug-ci
description: >
Investigate CI failures on a PR using gh and bk CLI tools. Trigger when the
user asks about failing checks, Buildkite failures, or CI issues — including
casual phrases like "why is CI red", "build broken", "checks failing", "what
went wrong in CI", "nightly broke", "tests failing on this PR", or pastes a
Buildkite URL. Also trigger when the user mentions a specific PR number and
wants to understand why it's failing.
argument-hint: <PR number or GitHub PR URL>
---

@@ -1,5 +1,5 @@
---
name: limits-test
name: mz-limits-test
description: >
This skill should be used when the user wants to add, modify, or debug a limits
test. Trigger when the user mentions "limits test", "Generator subclass",
@@ -1,13 +1,13 @@
---
name: parallel-workload
name: mz-parallel-workload
description: >
This skill should be used when the user wants to extend the parallel-workload
test framework, add a new action, modify existing actions, or add SQL coverage
to parallel workload. Trigger when the user mentions "parallel workload",
"parallel-workload", "action.py" in the context of parallel workload,
or wants to test for panics or unexpected query errors under concurrency.
Use this skill even if the user just says "add this to parallel workload" or
references a bug that panics under concurrent DDL/DML.
Extend the parallel-workload stress-testing framework, which runs random SQL
actions concurrently to catch panics and unexpected errors (not performance
measurement — see mz-benchmark for that). Trigger when the user mentions
"parallel workload", "parallel-workload", "action.py" in the context of
parallel workload, or wants to test for panics or unexpected query errors under
concurrency. Use this skill even if the user just says "add this to parallel
workload" or references a bug that panics under concurrent DDL/DML.
---

# Extending Parallel Workload
@@ -1,5 +1,5 @@
---
name: platform-checks
name: mz-platform-checks
description: >
This skill should be used when the user wants to create, modify, or debug a
platform check. Trigger when the user mentions "platform check", "platform-checks",
11 changes: 8 additions & 3 deletions .claude/skills/mz-pr-review/SKILL.md
@@ -1,7 +1,12 @@
---
description: >
Local PR review against Materialize standards.
This skill should be used when the user asks for a review of changes.
name: mz-pr-review
description: >
Perform a local code review of the current branch's changes against Materialize
project standards. Trigger when the user says "review my code", "review my
changes", "check my diff", "does this look ok", "what do you think of this
PR", "code review", or asks you to look over changes before merging. Also
trigger when the user passes a PR number and wants feedback on quality, style,
or correctness.
argument-hint: [base-branch]
allowed-tools: [Bash, Read, Grep, Glob, Task]
---
@@ -1,11 +1,11 @@
---
name: query-tracing
name: mz-query-tracing
description: >
This skill should be used when the user wants to debug or trace where time is
spent during query execution in Materialize. Trigger when the user mentions
tracing queries, span analysis, query latency breakdown, Tempo traces,
trace IDs, or wants to understand where time goes in CREATE TABLE, SELECT,
INSERT, DROP, or any other SQL statement execution.
Debug where time is spent during SQL execution using distributed tracing
(OpenTelemetry / Tempo). Trigger when the user asks "why is this query slow",
"where is the time going", "this SELECT takes forever", or wants a latency
breakdown for any SQL statement. Also trigger on mentions of tracing queries,
span analysis, Tempo traces, trace IDs, or opentelemetry_filter.
---

# Query Tracing Skill
@@ -164,7 +164,7 @@ parent span IDs are base64-encoded, not hex. The analysis script handles this.
Use the `trace_tree.py` script in this skill directory to analyze traces:

```bash
python3 .claude/skills/query-tracing/trace_tree.py /tmp/claude-1000/trace.json "My Query"
python3 .claude/skills/mz-query-tracing/trace_tree.py /tmp/claude-1000/trace.json "My Query"
```
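
The hunk context above notes that parent span IDs in the trace output are base64-encoded rather than hex. As a rough illustration of that conversion (not the code in `trace_tree.py`, which is not shown in this diff), the hypothetical helper below decodes a base64 ID into the lowercase hex form that tracing UIs typically display.

```python
import base64


def b64_id_to_hex(encoded: str) -> str:
    """Decode a base64-encoded trace or span ID into lowercase hex."""
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently discarding them.
    return base64.b64decode(encoded, validate=True).hex()


if __name__ == "__main__":
    # Round-trip an arbitrary 8-byte span ID to show the two encodings.
    raw = bytes.fromhex("00f067aa0ba902b7")
    as_b64 = base64.b64encode(raw).decode("ascii")
    print(as_b64, "->", b64_id_to_hex(as_b64))
```

Comparing IDs in a single encoding matters when matching a span's parent ID against IDs copied from elsewhere, since the same bytes look different in base64 and hex.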

## Interpreting Results
11 changes: 7 additions & 4 deletions .claude/skills/mz-test/SKILL.md
@@ -1,13 +1,16 @@
---
name: mz-test
description: >
This skill should be used when the user wants to "run tests", "run testdrive",
General guide for running tests and choosing the right test framework in
Materialize. Trigger when the user wants to "run tests", "run testdrive",
"run sqllogictest", "run mzcompose", "run cargo test", "run pgtest",
"rewrite test results", "add a test", "reproduce a bug", "write a regression
test", or mentions testing, testdrive, sqllogictest, mzcompose, pgtest,
cargo test, nextest, flaky tests, or test failures in the Materialize
repository. Use this skill even if the user just says "test this" or
"how do I verify this works" without naming a specific framework.
cargo test, nextest, flaky tests, or test failures. Use this skill even if the
user just says "test this" or "how do I verify this works" without naming a
specific framework. For deep guidance on specific frameworks, see the dedicated
skills: mz-platform-checks (upgrade/restart survival), mz-parallel-workload
(concurrent stress testing), and mz-limits-test (scaling to many objects).
---

# Testing Materialize
6 changes: 6 additions & 0 deletions CLAUDE.md
@@ -1,5 +1,11 @@
# Materialize

## Skills

This repo has Materialize-specific skills in `.claude/skills/`. Before starting
a task, check if a relevant `mz-*` skill exists — they encode project-specific
conventions and save significant time.

## Code navigation

When tracing how an operation flows through the codebase, read these files first: