feat(dsql): enhance query plan explainability with type coercion detection, rewrites, and workflow extraction#162

Open
Morlej wants to merge 5 commits into awslabs:main from Morlej:feat/dsql-query-plan-explainability

Morlej commented May 8, 2026

Summary

  • Extract Workflow 8 from SKILL.md into references/query-plan/workflow.md (SKILL.md: 334 → 275 LOC)
  • Add type coercion index bypass detection — implicit cast compatibility matrix in plan-interpretation.md, indexed column type queries in catalog-queries.md
  • Add query rewrite references: query-rewrites-generic.md (11 patterns: OR-to-IN, subquery unnesting, LEFT-to-INNER JOIN, GROUP BY pushdown, etc.) and query-rewrites-dsql-specific.md (reltuples estimate, split large joins for DP threshold)
  • Add structured trigger criteria, context disambiguation, and routing to the workflow reference
  • Wire rewrites into workflow — loaded at Phase 0, applied at Phase 2
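
As an illustration of the OR-to-IN pattern named in the list above, here is a minimal sketch. It runs against Python's built-in sqlite3 rather than Aurora DSQL, purely so the example is self-contained. The `orders` table, its columns, and the function name are hypothetical, and the sketch does not reproduce the contents of query-rewrites-generic.md.

```python
import sqlite3


def or_to_in_demo():
    """Run an OR-chain query and its IN-list rewrite; return both result sets."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table used only for this illustration.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO orders (id, status) VALUES (?, ?)",
        [(1, "new"), (2, "paid"), (3, "shipped"), (4, "cancelled")],
    )

    # Original form: repeated equality predicates on the same column.
    before = (
        "SELECT id FROM orders "
        "WHERE status = 'new' OR status = 'paid' OR status = 'shipped' "
        "ORDER BY id"
    )
    # Rewritten form: a single IN list over the same values.
    after = (
        "SELECT id FROM orders "
        "WHERE status IN ('new', 'paid', 'shipped') "
        "ORDER BY id"
    )

    rows_before = conn.execute(before).fetchall()
    rows_after = conn.execute(after).fetchall()
    conn.close()
    return rows_before, rows_after
```

Both queries return the same rows, which is the property a rewrite has to preserve. Whether the IN form actually yields a better plan on DSQL would still need to be confirmed with EXPLAIN on the real cluster.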

Validation

  • validate-size.py: 275 lines (good, under 300 limit)
  • validate-references.py: 0 broken links, 0 new orphans
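
The size check reported above can be approximated with a short sketch. This is not the repository's validate-size.py, whose implementation is not shown in this PR. The function name is hypothetical, and treating the 300-line limit as a strict upper bound is an assumption.

```python
MAX_LINES = 300  # limit quoted in the PR description; strictness is assumed


def check_size(text: str, limit: int = MAX_LINES):
    """Return (line_count, is_under_limit) for the given file contents."""
    line_count = len(text.splitlines())
    return line_count, line_count < limit
```

Applied to the numbers in this PR, a 275-line SKILL.md passes and the earlier 334-line version does not.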

Eval Results

Behavioral with-skill vs baseline comparison (full results in tools/evals/databases-on-aws/dsql/query_plan_rewrite_eval_results.md):

| Eval | Scenario | With Skill | Baseline | Key Delta |
|------|----------|------------|----------|-----------|
| 200 | IN-subquery Full Scan | PASS | PARTIAL | Skill recommends specific rewrite patterns from reference |
| 201 | Type coercion index bypass | PASS | PASS | Both identify it; skill adds DSQL-specific B-Tree detail |
| 202 | 12-table join ordering | PASS | PARTIAL | Skill offers full diagnostic workflow with GUC experiments |
| 203 | COUNT(*) timeout | PASS | FAIL | Skill recommends pg_class reltuples; baseline suggests retry |
| 204 | Multiple OR to IN | PASS | PARTIAL | Skill identifies pattern from reference |
| 205 | GROUP BY after JOIN | PASS | PARTIAL | Skill recommends subquery aggregation |
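
For eval 203, the reltuples approach can be sketched as a query builder. This is a minimal sketch and assumes the standard PostgreSQL catalogs `pg_class` and `pg_namespace` are queryable as in stock PostgreSQL. The function name and the psycopg-style `%s` placeholders are assumptions, and the exact query in query-rewrites-dsql-specific.md may differ.

```python
def estimated_row_count_query(schema: str, table: str):
    """Build (sql, params) for a row-count estimate from planner statistics.

    reltuples is an estimate, not an exact count, and can be stale if
    statistics have not been refreshed recently.
    """
    sql = (
        "SELECT c.reltuples::bigint AS estimated_rows "
        "FROM pg_class AS c "
        "JOIN pg_namespace AS n ON n.oid = c.relnamespace "
        "WHERE n.nspname = %s AND c.relname = %s"
    )
    # Parameters are passed separately so the driver handles quoting.
    return sql, (schema, table)
```

The estimate reads one catalog row instead of scanning the table, which is why it is a candidate replacement for a `COUNT(*)` that times out when an approximate answer is acceptable.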

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

🤖 Generated with Claude Code

Morlej requested review from a team as code owners May 8, 2026 23:33
Morlej and others added 2 commits May 8, 2026 18:36
…ction and rewrite references

- Add structured trigger phrases and routing criteria for query plan diagnosis
- Add type coercion index bypass detection (implicit cast compatibility matrix)
- Extend catalog queries with indexed column type retrieval
- Add generic SQL rewrite reference (11 patterns: OR-to-IN, subquery unnesting, etc.)
- Add DSQL-specific rewrite reference (reltuples estimate, split large joins for DP threshold)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract Workflow 8 (query plan explainability) from SKILL.md into
  references/query-plan/workflow.md to stay under the 300 LOC limit
- Wire query-rewrites-generic.md and query-rewrites-dsql-specific.md
  into the workflow (Phase 0 load list + Phase 2 evidence gathering)
- Add behavioral evals (query_plan_rewrite_evals.json) covering type
  coercion detection, subquery unnesting, OR-to-IN, GROUP BY pushdown,
  large join splitting, and reltuples estimation
- Add eval results (query_plan_rewrite_eval_results.md) with
  with-skill vs baseline comparison

Validation:
- validate-size.py: 275 lines (good)
- validate-references.py: 0 broken links, 0 new orphans

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Morlej force-pushed the feat/dsql-query-plan-explainability branch from 8e33741 to 8261713 May 8, 2026 23:36
Morlej and others added 2 commits May 8, 2026 18:38
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
amaksimo (Contributor) left a comment

I have a few general comments:

  1. We should use positive language throughout (an LLM can confuse DO with DO NOT when context is trimmed)
  2. We should use RFC keywords (MUST, SHOULD, MAY) more frequently throughout
  3. We should break up the references in the query-plan folder, as some of the files are very long

Comment thread plugins/databases-on-aws/skills/dsql/references/query-plan/workflow.md Outdated
Comment thread plugins/databases-on-aws/skills/dsql/references/query-plan/workflow.md Outdated
…, RFC keywords

Review feedback from amaksimo:

- Split query-rewrites-generic.md into 11 individual files under
  query-rewrites/ subdirectory to reduce context consumption
- Split query-rewrites-dsql-specific.md into individual files
- Convert monolithic files to index tables pointing to sub-files
- Fix DATEADD() SQL Server syntax → PostgreSQL NOW() - INTERVAL
- Flip negative language ("Do not apply") to positive ("Skip when")
- Add RFC keywords (MUST, SHOULD, MAY) throughout
- Remove psql fallback from workflow.md (enforce MCP usage)
- Update plan-interpretation.md recommendation template with RFC language
- Make Phase 0 loading explicit: MUST for core refs, SHOULD for rewrites

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>