feat: add column and table extraction for complex queries with subqueries #28

Open
label-hook[bot] wants to merge 1 commit into master from phoenix/issue-25

label-hook[bot] commented Apr 7, 2026

Summary

Adds functionality to extract column names along with their source table names from complex SQL queries including nested subqueries. This addresses the need to analyze column-to-table relationships in sophisticated SQL statements with JOINs, aliases, and multiple nesting levels.

Changes

  • Added extract_columns_and_tables() function in sqlparse/utils.py that recursively traverses the SQL AST to identify column references and their source tables
  • Added helper functions for identifying column references, table aliases, and building comprehensive column-to-table mappings
  • Created example script examples/extract_columns_and_tables.py demonstrating usage with complex queries including the provided test case
  • Updated existing example examples/extract_table_names.py to reference the new column extraction functionality
  • Added comprehensive test suite tests/test_column_extraction.py covering various SQL patterns including subqueries, JOINs, aliases, and edge cases

Testing

  • All existing tests continue to pass
  • New test suite covers:
    • Simple SELECT statements with column extraction
    • Complex queries with nested subqueries (including the provided example)
    • JOIN operations with table aliases
    • Column aliases and qualified column names
    • Edge cases with CTEs and various SQL dialects
  • Verified correct extraction of columns ('Gender', 'Ethnicity', 'count') and their source tables from the complex example query

Closes #25



Labels

ai:review Phoenix AI: PR ready for review

Development

Successfully merging this pull request may close these issues.

How to extract columns name along with table name from complex query having subquery
