49 changes: 25 additions & 24 deletions plugins/databases-on-aws/skills/dsql/SKILL.md
@@ -1,22 +1,14 @@
---
name: dsql
description: "Build with Aurora DSQL — manage schemas, execute queries, handle migrations, diagnose query plans, and develop applications with a serverless, distributed SQL database. Covers IAM auth, multi-tenant patterns, MySQL-to-DSQL migration, DDL operations, query plan explainability, and SQL compatibility validation. Triggers on phrases like: DSQL, Aurora DSQL, create DSQL table, DSQL schema, migrate to DSQL, distributed SQL database, serverless PostgreSQL-compatible database, DSQL query plan, DSQL EXPLAIN ANALYZE, why is my DSQL query slow."
description: "Build with Aurora DSQL — manage schemas, execute queries, handle migrations, diagnose query plans, load data, and develop applications with a serverless, distributed SQL database. Covers IAM auth, multi-tenant patterns, MySQL-to-DSQL migration, DDL operations, query plan explainability, SQL compatibility validation, and bulk data loading. Triggers on phrases like: DSQL, Aurora DSQL, create DSQL table, DSQL schema, migrate to DSQL, distributed SQL database, serverless PostgreSQL-compatible database, DSQL query plan, DSQL EXPLAIN ANALYZE, why is my DSQL query slow, aurora-dsql-loader, load CSV into DSQL."
license: Apache-2.0
metadata:
tags: aws, aurora, dsql, distributed-sql, distributed, distributed-database, database, serverless, serverless-database, postgresql, postgres, sql, schema, migration, multi-tenant, iam-auth, aurora-dsql, mcp, orm
tags: aws, aurora, dsql, distributed-sql, distributed, distributed-database, database, serverless, serverless-database, postgresql, postgres, sql, schema, migration, multi-tenant, iam-auth, aurora-dsql, mcp, orm, data-loading
---

# Amazon Aurora DSQL Skill

Aurora DSQL is a serverless, PostgreSQL-compatible distributed SQL database. This skill provides direct database interaction via MCP tools, schema management, migration support, and multi-tenant patterns.

**Key capabilities:**

- Direct query execution via MCP tools
- Schema management with DSQL constraints
- Migration support and safe schema evolution
- Multi-tenant isolation patterns
- IAM-based authentication
Aurora DSQL is a serverless, PostgreSQL-compatible distributed SQL database. This skill covers direct query execution via MCP tools, schema management, migrations, multi-tenant isolation, IAM auth, and bulk data loading via `aurora-dsql-loader`.

---

@@ -60,6 +52,11 @@ sampled in [.mcp.json](../../.mcp.json)
**When:** Load when debugging errors or unexpected behavior. SHOULD always consult for OCC errors, connection failures, or unexpected query results.
**Contains:** Common pitfalls, error messages, solutions

### [data-loading.md](references/data-loading.md)

**When:** Load when planning or running bulk loads with `aurora-dsql-loader`, or diagnosing slow load times.
**Contains:** Fresh-vs-warm partition behavior, resume/retry mechanics (`--manifest-dir`, `--resume-job-id`), `--on-conflict do-nothing` semantics, schema inference caveats, index-count throughput impact, diagnostic decision tree

### [onboarding.md](references/onboarding.md)

**When:** User explicitly requests to "Get started with DSQL" or similar phrase
@@ -111,7 +108,7 @@ sampled in [.mcp.json](../../.mcp.json)

### Query Plan Explainability (modular):

**When:** MUST load all four at Workflow 8 Phase 0 — [query-plan/plan-interpretation.md](references/query-plan/plan-interpretation.md), [query-plan/catalog-queries.md](references/query-plan/catalog-queries.md), [query-plan/guc-experiments.md](references/query-plan/guc-experiments.md), [query-plan/report-format.md](references/query-plan/report-format.md)
**When:** MUST load all four at Workflow 9 Phase 0 — [query-plan/plan-interpretation.md](references/query-plan/plan-interpretation.md), [query-plan/catalog-queries.md](references/query-plan/catalog-queries.md), [query-plan/guc-experiments.md](references/query-plan/guc-experiments.md), [query-plan/report-format.md](references/query-plan/report-format.md)
**Contains:** DSQL node types + Node Duration math + estimation-error bands, pg_class/pg_stats/pg_indexes SQL + correlated-predicate verification, GUC experiment procedures + 30-second skip protocol, required report structure + element checklist + support request template

### SQL Compatibility Validation:
@@ -182,6 +179,7 @@ See [scripts/README.md](../../scripts/README.md) for usage and hook configuratio
1. **Explore:** Use `readonly_query` with `information_schema` to list tables. Use `get_schema` for table structure.
2. **Query:** Use `readonly_query` for SELECT queries. **MUST** include `tenant_id` in WHERE for multi-tenant apps. **MUST** build SQL with `safe_query.build()`.
3. **Schema changes:** Use `transact` with one DDL per transaction. **MUST** batch DML under 3,000 rows. **MUST** use `CREATE INDEX ASYNC` in a separate call. Use `dsql_lint` to validate first.
4. **Bulk load data:** Use `aurora-dsql-loader` for CSV/TSV/Parquet. Load [data-loading.md](references/data-loading.md) for details. Use `--dry-run` first.

---

@@ -217,31 +215,40 @@ Every DDL statement generated in this workflow MUST be validated with `dsql_lint

**Recovery — batch fails midway:** Rows already updated keep their new value (each batch committed independently). Resume by filtering on the unset state (`WHERE new_column IS NULL`) and continue. Re-running is safe because the filter naturally excludes completed rows.
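
The batching rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill's tooling: the helper name `batches` and the default of 2,999 rows per batch are assumptions chosen only to stay under the 3,000-row limit stated in this document.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

ROW_LIMIT = 3_000  # per-transaction row limit quoted in this skill


def batches(items: Sequence[T], max_rows: int = ROW_LIMIT - 1) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each strictly under ROW_LIMIT rows."""
    if not 1 <= max_rows < ROW_LIMIT:
        raise ValueError(f"max_rows must be between 1 and {ROW_LIMIT - 1}")
    for start in range(0, len(items), max_rows):
        yield items[start:start + max_rows]


# Hypothetical usage: 7,000 pending row ids split into independently committed batches.
pending_ids = list(range(7_000))
for batch in batches(pending_ids):
    print(len(batch))
```

Because each batch commits on its own, a failure partway through leaves earlier batches applied, which is why the resume filter described above is needed.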

### Workflow 3: Application-Layer Referential Integrity
### Workflow 3: Bulk Data Loading

Use `aurora-dsql-loader` for CSV, TSV, or Parquet loads. MUST load [data-loading.md](references/data-loading.md) before advising on throughput or diagnosing slow loads.

1. Validate with `--dry-run` first
2. Run with `--manifest-dir` on persistent storage (not `/tmp` — tmpfs on AL2023, lost on crash) and `--header` if file has a header row
3. On failure: resume with `--resume-job-id`; for duplicates use `--on-conflict do-nothing`
4. For large tables: create secondary indexes after load using `CREATE INDEX ASYNC`

### Workflow 4: Application-Layer Referential Integrity

**INSERT:** MUST validate parent exists with readonly_query → throw error if not found → insert child with transact.

**DELETE:** MUST check dependents with readonly_query COUNT → return error if dependents exist → delete with transact if safe.

### Workflow 4: Query with Tenant Isolation
### Workflow 5: Query with Tenant Isolation

1. **MUST** authorize the caller against the tenant — format validation does not establish authorization
2. **MUST** build SQL with [`safe_query.build()`](mcp/tools/safe_query.py) — use `allow()`/`regex()` for
values (emits `'v'`), `ident()` for table/column names (emits `"v"`).
See [input-validation.md](mcp/tools/input-validation.md)
3. **MUST** include `tenant_id` in the WHERE clause; reject cross-tenant access at the application layer

### Workflow 5: Set Up Scoped Database Roles
### Workflow 6: Set Up Scoped Database Roles

MUST load [access-control.md](references/access-control.md) for role setup, IAM mapping, and schema permissions.

### Workflow 6: Table Recreation DDL Migration
### Workflow 7: Table Recreation DDL Migration

DSQL does NOT support direct `ALTER COLUMN TYPE`, `DROP COLUMN`, `DROP CONSTRAINT`, or `MODIFY PRIMARY KEY`. These require the **Table Recreation Pattern**. This is a destructive workflow that requires user confirmation at each step. Every generated DDL in the pattern (CREATE new, INSERT ... SELECT, DROP old, RENAME) MUST be validated with `dsql_lint(sql=..., fix=true)` before execution.

MUST load [ddl-migrations/overview.md](references/ddl-migrations/overview.md) before attempting any of these operations.

### Workflow 7: Validate and Migrate to DSQL
### Workflow 8: Validate and Migrate to DSQL

MUST load [dsql-lint.md](references/dsql-lint.md) before running `dsql_lint` — it defines diagnostic handling, the three `fix_result.status` values (`fixed`, `fixed_with_warning`, `unfixable`), and user-confirmation gates.

@@ -250,7 +257,7 @@ Run `dsql_lint(sql=source_sql, fix=true)` to validate and auto-convert PostgreSQ
- For MySQL-origin SQL, MUST cross-check the source against [mysql-migrations/type-mapping.md](references/mysql-migrations/type-mapping.md) even when lint returns clean — `ENGINE=` clauses and `SET(...)` column types can pass silently through the PostgreSQL parser.
- On `parse_error`, fall back to [mysql-migrations/type-mapping.md](references/mysql-migrations/type-mapping.md) for manual conversion, then re-run `dsql_lint` on the converted output before executing.

### Workflow 8: Query Plan Explainability
### Workflow 9: Query Plan Explainability

Explains why the DSQL optimizer chose a particular plan. Triggered by slow queries, high DPU, unexpected Full Scans, or plans the user doesn't understand. **REQUIRES a structured Markdown diagnostic report as the deliverable**, beyond conversation; run the workflow end-to-end before answering. Use the `aurora-dsql` MCP when connected; otherwise fall back to raw `psql` with a generated IAM token (see the fallback block below).

@@ -278,8 +285,6 @@ PGPASSWORD="$TOKEN" psql "host=$HOST port=5432 user=admin dbname=postgres sslmod

**Safety.** Plan capture uses `readonly_query` exclusively — it rejects INSERT/UPDATE/DELETE/DDL at the MCP layer. Rewrite DML to SELECT (Phase 1) rather than asking `transact --allow-writes` to run it; write-mode `transact` bypasses all MCP safety checks. **MUST NOT** run arbitrary DDL/DML or pl/pgsql.

---

## Error Scenarios

- **`awsknowledge` returns no results:** Use the default limits in the table above and note that limits should be verified against [DSQL documentation](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/).
@@ -288,11 +293,7 @@ PGPASSWORD="$TOKEN" psql "host=$HOST port=5432 user=admin dbname=postgres sslmod
- **Transaction exceeds limits:** Split into batches under 3,000 rows — see [batched-migration.md](references/ddl-migrations/batched-migration.md).
- **Token expiration mid-operation:** Generate a fresh IAM token — see [authentication-guide.md](references/auth/authentication-guide.md). See [troubleshooting.md](references/troubleshooting.md) for other issues.

---

## Additional Resources

- [Aurora DSQL Documentation](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/)
- [Code Samples Repository](https://github.com/aws-samples/aurora-dsql-samples)
- [PostgreSQL Compatibility](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/working-with-postgresql-compatibility.html)
- [CloudFormation Resource](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-dsql-cluster.html)
@@ -98,3 +98,7 @@ aurora-dsql-loader load \
--table my_table \
--dry-run
```

### When to load the full reference

Load [data-loading.md](../data-loading.md) when diagnosing slow loads, configuring resume/retry, or tuning conflict handling.
166 changes: 166 additions & 0 deletions plugins/databases-on-aws/skills/dsql/references/data-loading.md
@@ -0,0 +1,166 @@
# Data Loading with the DSQL Loader

Part of [DSQL Development Guide](development-guide.md).

The [DSQL Loader](https://github.com/aws-samples/aurora-dsql-loader) (`aurora-dsql-loader`)
is the recommended tool for bulk-loading CSV, TSV, or Parquet data into Aurora DSQL.

For installation and basic invocation, see [connectivity-tools.md](auth/connectivity-tools.md#data-loading-tools).

## Table of Contents

- [Fresh-vs-Warm Partition Behavior](#fresh-vs-warm-partition-behavior)
- [Resume and Retry Mechanics](#resume-and-retry-mechanics)
- [Conflict Handling](#conflict-handling---on-conflict-do-nothing)
- [CSV/TSV Header Handling](#csvtsv-header-handling)
- [Schema Inference Caveats](#schema-inference-caveats)
- [Index Count Affects Throughput](#index-count-affects-throughput)
- [Diagnostic Decision Tree](#diagnostic-decision-tree)

---

## Fresh-vs-Warm Partition Behavior

A DSQL table starts on a single partition. DSQL splits partitions under sustained write heat — no client tuning bypasses this.

- A fresh table absorbs a few thousand rec/s from a single client. Adding concurrency does not help — writes serialize against the single partition.
- Throughput grows as `partitions × per-partition-rate` until the client saturates.
- Splits require **sustained** write volume (10-20 minutes), not bursts.
- Random keys (UUIDs) spread heat; monotonic/sequential keys concentrate it and delay parallelism.

**Key insight:** throughput stuck at a few thousand rec/s on a fresh table is normal. Keep the load running — throughput accelerates as DSQL splits.

For large loads on a deadline, run a low-concurrency pre-pass to drive partition splits before the main load.
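
The `partitions × per-partition-rate` relationship can be made concrete with a small model. This is a sketch under stated assumptions: the per-partition rate and the client ceiling below are illustrative placeholders, not measured DSQL figures, and `estimated_throughput` is a hypothetical helper.

```python
def estimated_throughput(partitions: int, per_partition_rate: float, client_cap: float) -> float:
    """Rows/s the load can sustain: partitions x per-partition rate, capped by the client."""
    if partitions < 1:
        raise ValueError("a table always has at least one partition")
    return min(partitions * per_partition_rate, client_cap)


# Illustrative numbers only (assumptions, not DSQL specifications).
PER_PARTITION_RATE = 3_000.0  # stands in for "a few thousand rec/s"
CLIENT_CAP = 40_000.0         # hypothetical ceiling of the loading host

for partitions in (1, 2, 4, 8, 16):
    rate = estimated_throughput(partitions, PER_PARTITION_RATE, CLIENT_CAP)
    print(f"{partitions:>2} partitions -> {rate:,.0f} rows/s")
```

Under these assumed numbers the curve is linear in the partition count until the client becomes the bottleneck, which is the shape the bullets above describe.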

---

## Resume and Retry Mechanics

The loader writes a manifest tracking committed chunks. On resume, it restarts from the last committed chunk.

### `--manifest-dir <persistent-path>`

You **MUST** set `--manifest-dir` to a persistent path. Default `/tmp` is tmpfs on AL2023 — manifests are lost on process death.

```bash
aurora-dsql-loader load \
--endpoint your-cluster.dsql.us-east-1.on.aws \
--source-uri data.csv \
--table my_table \
--manifest-dir /var/lib/dsql-loader/manifests
```
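
Before a long load it can be worth confirming that the chosen manifest directory is not on tmpfs. The following is a minimal sketch of such a preflight check, not part of the loader: `filesystem_type` is a hypothetical helper that parses mount-table text in the Linux `/proc/mounts` layout (device, mount point, filesystem type, ...), and the sample table is invented for illustration.

```python
import os


def filesystem_type(path: str, mounts_text: str) -> str:
    """Return the fs type of the mount containing `path` (longest mount-point prefix wins)."""
    target = os.path.abspath(path)
    best_mount, best_type = "", "unknown"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Match on whole path components so "/tmpfoo" is not treated as under "/tmp".
        if (target + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


# Invented sample mount table for illustration.
SAMPLE_MOUNTS = """\
/dev/nvme0n1p1 / xfs rw,noatime 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
"""

print(filesystem_type("/tmp/manifests", SAMPLE_MOUNTS))
print(filesystem_type("/var/lib/dsql-loader/manifests", SAMPLE_MOUNTS))
```

On a real Linux host the second argument would be the contents of `/proc/mounts`; a result of `tmpfs` for the intended `--manifest-dir` is a signal to pick a different path.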

### `--resume-job-id <id>`

Re-runs continue from the last committed chunk. The job id is printed in the loader's log on the line beginning `Starting load job:`.

```bash
aurora-dsql-loader load \
--endpoint your-cluster.dsql.us-east-1.on.aws \
--source-uri data.csv \
--table my_table \
--manifest-dir /var/lib/dsql-loader/manifests \
--resume-job-id <job-id-from-log> \
--keep-manifest
```

### `--keep-manifest`

Retains the manifest after a successful load. Useful for auditing or idempotent re-runs.

---

## Conflict Handling: `--on-conflict do-nothing`

`--on-conflict do-nothing` silently skips rows that violate **any** unique constraint (primary key or any UNIQUE index) on the target table.

The agent **MUST** verify these preconditions before recommending `--on-conflict do-nothing`:

1. The target table **MUST** have at least one unique constraint on the conflict column(s).
2. The load **MUST** be idempotent — the same source row produces the same target row, so skipping duplicates yields the correct final state.
3. The source data **MUST NOT** have changed since the original run if using `do-nothing` for crash recovery. Changed source rows are silently kept at their old values.

**Common pitfall:** duplicate-PK rows in the source are silently dropped — `count(*)` on the target will be lower than the loader's "Records loaded" figure.
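
The size of that gap can be estimated from the source before loading. The sketch below is an independent pre-check, not loader functionality: `rows_skipped_by_duplicates` is a hypothetical helper, and it assumes a CSV with a header row, an initially empty target table, and conflicts on the given key columns only.

```python
import csv
import io
from collections import Counter
from typing import Iterable, Sequence


def rows_skipped_by_duplicates(source: Iterable[str], key_columns: Sequence[str]) -> int:
    """Count source rows whose key repeats an earlier row in the same source."""
    reader = csv.DictReader(source)
    key_counts = Counter(tuple(row[col] for col in key_columns) for row in reader)
    # For each key, every occurrence after the first would be skipped.
    return sum(count - 1 for count in key_counts.values())


# Invented sample: id 1 appears twice, so one row would be skipped.
sample = io.StringIO("id,name\n1,alice\n2,bob\n1,carol\n")
print(rows_skipped_by_duplicates(sample, ["id"]))
```

A non-zero result means the target's `count(*)` should be expected to fall short by that many rows when `--on-conflict do-nothing` is in effect.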

---

## CSV/TSV Header Handling

You **MUST** pass `--header` if the CSV/TSV file has a header row. The loader treats every row as data by default.

```bash
aurora-dsql-loader load \
--endpoint your-cluster.dsql.us-east-1.on.aws \
--source-uri sales_with_header.csv \
--table sales \
--header
```

**Symptoms of a missing `--header`:**

- `invalid input syntax for type <T>: "<column_name>"` — header values inserted as data.
- First batch fails entirely while subsequent batches succeed.

**Legacy behavior (v2.x):** older versions defaulted to assuming a header row. If upgrading from v2.x, add `--header` to invocations loading header-bearing files.

---

## Schema Inference Caveats

> **These produce successful loads with no error or warning.** You **MUST** validate with `--dry-run` against any new table.

Schema inference works well for homogeneous, well-typed inputs but silently produces wrong types for:

- **Mixed nullability across files** — column infers as `TEXT` instead of numeric/date.
- **Numeric-looking identifiers** (ZIP codes, phone numbers with leading zeros) — infers as integer, losing leading characters.
- **Non-ISO date formats** — falls back to `TEXT` silently.

```bash
aurora-dsql-loader load \
--endpoint your-cluster.dsql.us-east-1.on.aws \
--source-uri data.csv \
--table my_table \
--dry-run
```

If the inferred schema is wrong, create the table explicitly and re-run without `--if-not-exists`.
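
The leading-zero case is easy to screen for ahead of time. This sketch does not reproduce the loader's inference rules, which are not shown here; `loses_leading_zeros` is a hypothetical helper that flags columns where every value is all digits (so an integer type is plausible) and at least one value starts with `0`.

```python
from typing import Sequence


def loses_leading_zeros(values: Sequence[str]) -> bool:
    """True if the column looks like an integer but integer storage would drop leading zeros."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return False
    if not all(v.isascii() and v.isdigit() for v in non_empty):
        return False  # not integer-looking, so this particular risk does not apply
    return any(len(v) > 1 and v.startswith("0") for v in non_empty)


zip_codes = ["02134", "10001", "94105"]
print(loses_leading_zeros(zip_codes))   # the ZIP column is at risk
print(int("02134"))                     # integer parsing discards the leading zero
```

Columns flagged this way are candidates for an explicit `TEXT` (or similar) column in a hand-written table definition.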

---

## Index Count Affects Throughput

Each row written costs `1 + num_indexes` writes: one for the base row plus one index entry per secondary index. Tables with many secondary indexes load noticeably slower, and the partition-warming curve stretches out correspondingly.

Practical guidance:

- For large loads, **SHOULD** create secondary indexes **after** the bulk load using `CREATE INDEX ASYNC`.
- For tables queried during ingestion, keep the indexes those queries depend on in place; the throughput cost is usually preferable to queries falling back to full scans.
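
As a worked example of the `1 + num_indexes` cost, the arithmetic below compares loading with indexes in place against loading first and indexing afterwards. The row and index counts are invented for illustration, and `total_writes` is a hypothetical helper.

```python
def total_writes(rows: int, num_indexes: int) -> int:
    """Writes issued during the load: each row costs 1 + num_indexes."""
    if rows < 0 or num_indexes < 0:
        raise ValueError("rows and num_indexes must be non-negative")
    return rows * (1 + num_indexes)


ROWS = 10_000_000  # invented example size

with_indexes = total_writes(ROWS, num_indexes=4)
without_indexes = total_writes(ROWS, num_indexes=0)
print(f"4 indexes in place: {with_indexes:,} writes")
print(f"indexes deferred:   {without_indexes:,} writes")
print(f"ratio: {with_indexes // without_indexes}x")
```

The deferred case still pays for index construction later through `CREATE INDEX ASYNC`, but that work happens outside the load's critical path.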

---

## Diagnostic Decision Tree

### Symptom: throughput stuck at a few thousand rec/s; host CPU is low

**Cause:** partition-constrained (fresh/few partitions).
**Action:** keep the load running. Throughput accelerates as DSQL splits. For recurring fresh-table loads, run a pre-pass to drive splits.

### Symptom: throughput below expected; host CPU > 90%

**Cause:** host-bound.
**Action:** reduce concurrency (`--workers`, `--batch-concurrency`) or use a larger host.

### Symptom: throughput below expected; host CPU ~50%; persists past 15 minutes

**Cause:** hot-key — many rows hashing to the same partition.
**Action:** inspect source for PK skew. Verify UUIDs are genuinely random (v4); timestamp-derived UUIDs are not: v7 values share a high-order time prefix, and v1 values from one host share their node suffix.
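
One way to check the randomness of UUID keys is to tally their version field across a sample of the source. This is a sketch using Python's standard `uuid` module; `uuid_versions` is a hypothetical helper and assumes keys arrive as strings.

```python
import uuid
from collections import Counter
from typing import Iterable


def uuid_versions(keys: Iterable[str]) -> Counter:
    """Tally UUID versions among string keys; unparsable keys are counted under None."""
    tally: Counter = Counter()
    for key in keys:
        try:
            tally[uuid.UUID(key).version] += 1
        except ValueError:
            tally[None] += 1  # not a well-formed UUID string
    return tally


sample_keys = [str(uuid.uuid4()), str(uuid.uuid4()), str(uuid.uuid1()), "not-a-uuid"]
for version, count in uuid_versions(sample_keys).items():
    print(f"version {version}: {count}")
```

A sample dominated by version 4 suggests random keys; a large share of any other version is worth a closer look at how the keys were generated.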

### Symptom: "Records loaded" exceeds `SELECT count(*)` on target

**Cause:** duplicate keys in source + `--on-conflict do-nothing`.
**Action:** check source for duplicate-PK rows. De-duplicate or document the gap.

### Symptom: loader crashed; manifest is gone

**Cause:** manifest was in `/tmp` (tmpfs) and cleared on exit.
**Action:** re-run from beginning. If table has a unique constraint and load is idempotent, use `--on-conflict do-nothing` to skip already-committed rows. For future loads, **MUST** set `--manifest-dir` to persistent path.
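
The throughput branches of the tree above can be condensed into a small function. This is a sketch, not a definitive classifier: it assumes throughput is already known to be below expectation, and the 40% cutoff separating "low" from "around 50%" CPU is a hypothetical value chosen for illustration (the 90% and 15-minute thresholds come from the symptoms above).

```python
def diagnose(host_cpu_pct: float, minutes_elapsed: float, mid_cpu_floor: float = 40.0) -> str:
    """Map slow-load symptoms to the likely cause named in the decision tree."""
    if host_cpu_pct > 90:
        return "host-bound"
    if host_cpu_pct >= mid_cpu_floor and minutes_elapsed > 15:
        return "hot-key"
    # Low CPU, or not yet past the 15-minute mark: most likely still waiting on splits.
    return "partition-constrained"


ACTIONS = {
    "host-bound": "reduce concurrency or use a larger host",
    "hot-key": "inspect the source for primary-key skew",
    "partition-constrained": "keep the load running while DSQL splits partitions",
}

for cpu, minutes in ((95, 5), (50, 20), (10, 5)):
    cause = diagnose(cpu, minutes)
    print(f"cpu={cpu}% after {minutes} min -> {cause}: {ACTIONS[cause]}")
```

The record-count and lost-manifest symptoms are not modeled here because they are identified by direct observation rather than by CPU and elapsed time.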
@@ -69,7 +69,7 @@ Use for any SQL that was not composed by the agent itself from skill knowledge
4. If **any** diagnostic is `unfixable`, do NOT execute the returned `fixed_sql` — it still contains the unfixable portion verbatim. Collect user-confirmed rewrites from the Unfixable Errors table, merge them into the SQL, then re-run `dsql_lint(fix=true)` on the combined SQL to confirm it is clean.
5. Also surface the `fixed_sql` body itself to the user before executing — prompt-injection can hide inside rewritten statements.
6. Once diagnostics are resolved and the user has acknowledged, split the clean `fixed_sql` on statement boundaries.
7. For destructive DDL (`DROP`, `RENAME`, `TRUNCATE`) confirm with the user before executing, matching Workflow 6's confirmation gate.
7. For destructive DDL (`DROP`, `RENAME`, `TRUNCATE`) confirm with the user before executing, matching Workflow 7's confirmation gate.
8. Execute each DDL with `transact(["<single DDL statement>"])` — one DDL per call.
9. Verify schema with `get_schema`.

@@ -103,7 +103,7 @@ Only diagnostics with `fix_result.status == "unfixable"` need user-confirmed rew
| ---------------------------- | ----------------------------------------------------------------------------------------------------------------------- |
| `create_table_as` | CREATE TABLE with explicit columns, then `INSERT ... SELECT` |
| `truncate` | Use `DELETE FROM table_name` (batch if > 3,000 rows) |
| `unsupported_alter_table_op` | Use Table Recreation Pattern — see [ddl-migrations/overview.md](ddl-migrations/overview.md) and Workflow 6 |
| `unsupported_alter_table_op` | Use Table Recreation Pattern — see [ddl-migrations/overview.md](ddl-migrations/overview.md) and Workflow 7 |
| `add_column_constraint` | ADD COLUMN with name + type only, then backfill via UPDATE. If NOT NULL/DEFAULT required, use Table Recreation Pattern. |
| `index_expression` | Create a computed column, then index that column |
| `index_partial` | Create a full index; filter at query time |