feat(data-engineering): add 6 skills, 2 agents, 1 command for data engineers #216

Open
theondrejivan wants to merge 1 commit into EveryInc:main from theondrejivan:feat/data-engineering-skills

@theondrejivan

Summary

  • 6 new skills: dbt, snowflake, duckdb, databricks, warehouse-architecture, data-quality — covering the modern data stack with 21 reference files totaling 10K+ lines of patterns, SQL examples, and best practices
  • 2 new agents: dbt-model-reviewer (SQL quality, ref/source, materialization, testing) and data-pipeline-reviewer (idempotency, error handling, credential safety)
  • 1 new command: /data-scaffold — scaffold dbt staging models or dimensional data models with ERDs
  • 2 enhanced agents: performance-oracle (+ warehouse SQL optimization for Snowflake/DuckDB/Databricks) and architecture-strategist (+ data warehouse architecture: Kimball, SCD, medallion)

Bumps version to v2.36.0 (31 agents, 23 commands, 25 skills).

What's Included

| Skill | Pattern | References | Coverage |
|---|---|---|---|
| dbt | Intake/routing | 6 files | Project structure, models, testing, Jinja, incremental, packages |
| snowflake | Intake/routing | 4 files | QUALIFY/FLATTEN, optimization, cost management, Terraform |
| duckdb | Linear | 3 files | File querying, SQL extensions, Python/dbt integration |
| databricks | Intake/routing | 4 files | Delta Lake, Unity Catalog, Spark optimization, Terraform |
| warehouse-architecture | Linear (background) | 4 files | Kimball, Data Vault 2.0, Medallion, SCD types 0-6 |
| data-quality | Linear (background) | 4 files | Pandera, Great Expectations, Soda Core, dbt contracts, anomaly detection |

Security

  • All SQL examples use parameterized queries exclusively
  • dbt profiles.yml examples always use {{ env_var() }}
  • Terraform examples include remote state backend and .gitignore
  • Snowflake prefers key-pair auth; Databricks prefers service principal auth
  • Credential detection patterns documented in data-pipeline-reviewer
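
As an illustration of the first bullet, here is a minimal sketch of a parameterized query. It is not taken from the skill files: Python's built-in `sqlite3` stands in for a warehouse driver, and the `payments` table, its columns, and both function names are hypothetical.

```python
import sqlite3


def demo_connection():
    """Build an in-memory database with a hypothetical payments table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payments (id INTEGER, customer_id TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO payments VALUES (?, ?, ?)",
        [(1, "cus_1", 10.0), (2, "cus_2", 25.5)],
    )
    return conn


def payments_for_customer(conn, customer_id):
    """Fetch one customer's payments using a bound parameter.

    The value is passed separately from the SQL text, so it is never
    interpolated into the statement.
    """
    query = "SELECT id, amount FROM payments WHERE customer_id = ? ORDER BY id"
    return conn.execute(query, (customer_id,)).fetchall()
```

Because the customer id is bound rather than formatted into the string, an input such as `cus_1' OR '1'='1` is compared as a literal value and matches no rows. Placeholder syntax differs between drivers, so the `?` style shown here should not be assumed for Snowflake or Databricks connectors.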

Test Plan

  • Verify JSON files are valid: cat plugins/compound-engineering/.claude-plugin/plugin.json | jq .
  • Verify counts match: 31 agents, 23 commands, 25 skills
  • Install plugin locally and test: claude /data-scaffold dbt stripe.payments
  • Test agent routing: dbt model prompt → dbt-model-reviewer, not performance-oracle
  • Test skill loading: ask about Snowflake QUALIFY → loads snowflake skill
  • Verify context budget stays under 80%
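
The first two checks could be scripted along these lines. This is a sketch under assumptions: only the `.claude-plugin/plugin.json` path comes from the test plan above, while the `agents/`, `commands/`, and `skills/<name>/SKILL.md` layout and both function names are guesses about how components are stored.

```python
import json
from pathlib import Path

# Counts stated in this PR.
EXPECTED = {"agents": 31, "commands": 23, "skills": 25}


def count_components(plugin_root):
    """Count component files under an ASSUMED directory layout."""
    root = Path(plugin_root)
    return {
        "agents": sum(1 for _ in (root / "agents").rglob("*.md")),
        "commands": sum(1 for _ in (root / "commands").rglob("*.md")),
        "skills": sum(1 for _ in (root / "skills").glob("*/SKILL.md")),
    }


def check_plugin(plugin_root, expected=EXPECTED):
    """Return {component: (found, expected)} for every mismatch.

    Raises json.JSONDecodeError if plugin.json is not valid JSON.
    """
    root = Path(plugin_root)
    json.loads((root / ".claude-plugin" / "plugin.json").read_text())
    counts = count_components(root)
    return {
        name: (counts[name], expected[name])
        for name in expected
        if counts[name] != expected[name]
    }
```

Calling `check_plugin("plugins/compound-engineering")` from the repository root would return an empty dict when the counts match, provided the assumed layout holds.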

Post-Deploy Monitoring & Validation

No additional operational monitoring required: all changes are additive plugin content (markdown files) with no runtime infrastructure.


Compound Engineered 🤖 Generated with Claude Code

…gineers

Add comprehensive data engineering capabilities to make this the best
Claude Code plugin for data engineers:

Skills (6 new):
- dbt: project structure, models, testing, Jinja, incremental, packages
- snowflake: SQL patterns, optimization, cost management, Terraform
- duckdb: file querying, SQL extensions, Python/dbt integration
- databricks: Delta Lake, Unity Catalog, Spark optimization, Terraform
- warehouse-architecture: Kimball, Data Vault 2.0, Medallion, SCD types
- data-quality: Pandera, Great Expectations, Soda Core, dbt contracts

Agents (2 new):
- dbt-model-reviewer: SQL quality, ref/source, materialization, testing
- data-pipeline-reviewer: idempotency, error handling, credentials

Command (1 new):
- data-scaffold: scaffold dbt models or dimensional data models

Enhanced existing:
- performance-oracle: added warehouse SQL optimization section
- architecture-strategist: added data warehouse architecture section

Bumps version to 2.36.0 (31 agents, 23 commands, 25 skills).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@theondrejivan
Author

Added this to cover modern data engineering needs. Feel free to dismiss if it's not the direction you want this plugin to go.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f692cd7b27

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +66 to +70
````markdown
5. **Generate source YAML** at `models/staging/<source>/_<source>__sources.yml`:
```yaml
sources:
  - name: <source>
    description: "TODO: Describe the <source> data source"
````

P1: Merge source YAML instead of regenerating it

Step 3 checks for an existing source definition, but step 5 still generates `_<source>__sources.yml` from a single-table template. In an existing dbt project, running `/data-scaffold dbt <source>.<table>` for another table in the same source would overwrite the existing `tables:` entries and freshness metadata, which is destructive. This flow should append or merge when the file already exists.

Comment on lines +82 to +86
````markdown
6. **Generate model YAML** at `models/staging/<source>/_<source>__models.yml`:
```yaml
models:
  - name: stg_<source>__<table>
    description: "Cleaned <source> <table> with standardized column names"
````

P1: Append model YAML entries instead of replacing file

Step 6 generates `_<source>__models.yml` as a fresh document each time, so scaffolding a second model for the same source can erase existing model descriptions and tests in that YAML file. That makes iterative use unsafe in real dbt repos. When the file is already present, the command should add a new `models` entry rather than replace the file wholesale.
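
One way to picture the append-or-merge behaviour requested in the two P1 comments is the sketch below. It is not the command's implementation: it works on already-parsed YAML (plain dicts and lists), leaves loading and dumping to a YAML library such as PyYAML or ruamel.yaml, and all three function names are hypothetical.

```python
from copy import deepcopy


def upsert_by_name(entries, new_entry):
    """Return a copy of entries with new_entry added if its name is new.

    When an entry with the same name already exists, its keys are kept
    and only keys it lacks are filled in from new_entry.
    """
    merged = deepcopy(entries)
    for entry in merged:
        if entry.get("name") == new_entry["name"]:
            for key, value in new_entry.items():
                entry.setdefault(key, deepcopy(value))
            return merged
    merged.append(deepcopy(new_entry))
    return merged


def add_model(doc, model):
    """Add a model entry to a parsed models YAML document (None if absent)."""
    doc = deepcopy(doc) if doc else {}
    doc["models"] = upsert_by_name(doc.get("models", []), model)
    return doc


def add_source_table(doc, source_name, table):
    """Add a table under a source in a parsed sources YAML document."""
    doc = deepcopy(doc) if doc else {}
    sources = upsert_by_name(
        doc.get("sources", []), {"name": source_name, "tables": []}
    )
    for source in sources:
        if source["name"] == source_name:
            source["tables"] = upsert_by_name(source.get("tables", []), table)
    doc["sources"] = sources
    return doc
```

With this shape, scaffolding a second table for an existing source leaves hand-written descriptions, tests, and freshness settings in place, and re-running the scaffold for an existing name does not create a duplicate entry. Note that a plain load-and-dump round trip through PyYAML drops comments, which is one reason a round-trip-preserving library might be preferred.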

Comment on lines +149 to +150
````markdown
```markdown
```mermaid
````

P2: Close the outer markdown fence in model mode example

The `` ```markdown `` fence opened before the Mermaid snippet is never closed, so the remaining model-mode guidance renders as code-block content instead of normal instructions. As a result, steps 5–7 may be treated as literal sample text rather than executable guidance when the command is followed. Close the outer fence (or use nested fencing) to restore correct parsing.
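
A small check along the following lines could catch this class of problem before review. It is a sketch built on a simplifying assumption that is stricter than CommonMark: every fence line with an info string is treated as opening a new (possibly nested) block, and every bare fence as closing the most recent one. The function name is hypothetical.

```python
import re

# A fence line: up to three spaces, then 3+ backticks or tildes, then an
# optional info string such as "markdown" or "mermaid".
FENCE = re.compile(r"^\s{0,3}(`{3,}|~{3,})\s*(\S*)")


def unclosed_fences(text):
    """Return (line_number, info_string) for each opener left unclosed.

    ASSUMPTION: fences with an info string always open a block, and bare
    fences close the innermost open block (or open an unlabeled block
    when nothing is open).
    """
    stack = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        info = match.group(2)
        if info:
            stack.append((number, info))
        elif stack:
            stack.pop()
        else:
            stack.append((number, ""))
    return stack
```

On a file shaped like the one flagged here (a `markdown` opener, a `mermaid` opener, and a single bare closing fence), the function reports the `markdown` opener on line 1 as unclosed. Adding a second closing fence makes it return an empty list.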