Refactor API: Writer-centric rendering and Python module restructure #90

Closed
cpsievert wants to merge 7 commits into posit-dev:rust-api from cpsievert:rust-api-refactor

Conversation

@cpsievert
Collaborator

@cpsievert cpsievert commented Jan 30, 2026

This PR builds on #89 to improve the Python package with a cleaner API structure, proper IDE support, and better error handling.

API Tweaks

Before

import ggsql
import polars as pl

df = pl.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

reader = ggsql.DuckDBReader("duckdb://memory")
reader.register("data", df)
prepared = ggsql.prepare("SELECT * FROM data VISUALISE x, y DRAW point", reader)
json_str = prepared.render(ggsql.VegaLiteWriter())

After

import polars as pl
from ggsql.readers import DuckDB
from ggsql.writers import VegaLite

df = pl.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

reader = DuckDB("duckdb://memory")
spec = reader.execute(
    "SELECT * FROM data VISUALISE x, y DRAW point",
    {"data": df}  # Auto-registered, auto-cleaned up
)
writer = VegaLite()
chart = writer.render_chart(spec)  # Returns Altair chart
json_str = writer.render_json(spec)  # Or get raw JSON

A summary of the API changes:

| Before | After |
| --- | --- |
| ggsql.DuckDBReader(conn) | ggsql.readers.DuckDB(conn) |
| ggsql.VegaLiteWriter() | ggsql.writers.VegaLite() |
| ggsql.prepare(query, reader) | reader.execute(query, data_dict) |
| prepared.render(writer) | writer.render_json(spec) or writer.render_chart(spec) |
| reader.execute(sql) | reader.execute_sql(sql) |
| ValueError for all errors | Specific exception types |
| render_altair(df, viz) | Use writer.render_chart(spec) |

These changes will require (or inspire) some Rust API tweaks:

| Change | Why |
| --- | --- |
| Writer::render(&Prepared) | Moves rendering responsibility to Writer, enabling Python's writer.render_json(spec) pattern |
| Reader::execute() → execute_sql() | Frees up the execute name for Python's higher-level execute(query, data_dict) method |
| Reader::unregister(name) | Enables auto-cleanup in Python's execute(): tables are unregistered after the query |
| GgsqlError::NoVisualise | Enables Python's NoVisualiseError exception for queries missing a VISUALISE clause |

Other Python Improvements

1. Full IDE Support with Type Stubs

The new _ggsql.pyi provides complete type information for the native extension:

  • Autocomplete for all methods and parameters
  • Inline documentation in IDE tooltips
  • Type checking with mypy/pyright

2. Proper Exception Hierarchy

try:
    spec = reader.execute(query, {"data": df})
except ggsql.types.NoVisualiseError:
    # Query has no VISUALISE clause - use execute_sql() instead
    df = reader.execute_sql(query)
except ggsql.types.ValidationError as e:
    # Column doesn't exist, missing required aesthetic, etc.
    print(f"Invalid query: {e}")
except ggsql.types.ReaderError as e:
    # SQL execution failed
    print(f"Database error: {e}")
except ggsql.types.GgsqlError as e:
    # Catch-all for any ggsql error
    print(f"Error: {e}")
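
As a rough illustration of why the handler order above works, here is a minimal pure-Python sketch of such a hierarchy. The class names are taken from the commit message in this PR; making every class a direct subclass of GgsqlError, and the `classify` helper, are assumptions for illustration only, not the package's actual implementation.

```python
class GgsqlError(Exception):
    """Base class: catching this also catches every subclass below."""


# Assumption: each specific error derives directly from GgsqlError.
class ParseError(GgsqlError): ...
class ValidationError(GgsqlError): ...
class ReaderError(GgsqlError): ...
class WriterError(GgsqlError): ...
class NoVisualiseError(GgsqlError): ...


def classify(exc: Exception) -> str:
    """Hypothetical helper: specific handlers first, base class last."""
    try:
        raise exc
    except NoVisualiseError:
        return "no-visualise"
    except ValidationError:
        return "validation"
    except ReaderError:
        return "reader"
    except GgsqlError:
        # Reached by ParseError and WriterError, which have no handler above.
        return "other-ggsql"
```

Because Python picks the first matching `except` clause, the GgsqlError catch-all has to come last; placed first, it would shadow every more specific handler.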

3. Clean Module Structure

ggsql/
├── readers.py    # DuckDB (more backends later)
├── writers.py    # VegaLite (more formats later)  
├── types.py      # Prepared, Validated, exceptions
└── _ggsql.pyi    # Type stubs for IDE support

4. Auto-Registration with Cleanup

Tables passed to execute() are automatically:

  • Registered before query execution
  • Unregistered after (even on error)
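
The register-then-always-unregister behaviour described above is the usual try/finally pattern. The sketch below shows it with a stand-in reader; `FakeReader` and `execute_with_tables` are hypothetical names invented for this example and are not part of ggsql.

```python
class FakeReader:
    """Hypothetical stand-in that only tracks registered table names."""

    def __init__(self):
        self.tables = {}

    def register(self, name, df):
        self.tables[name] = df

    def unregister(self, name):
        self.tables.pop(name, None)


def execute_with_tables(reader, run_query, data):
    """Register each table, run the query, then unregister even on error."""
    registered = []
    try:
        for name, df in data.items():
            reader.register(name, df)
            registered.append(name)
        return run_query()
    finally:
        # Only undo what was actually registered, in reverse order.
        for name in reversed(registered):
            reader.unregister(name)
```

Tracking the names in `registered` means a failure halfway through registration still cleans up the tables that did get registered.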

5. Context Manager Support

with DuckDB("duckdb://memory") as reader:
    spec = reader.execute(query, {"data": df})
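
The PR does not say what the reader releases on exit. Assuming it closes its connection, context manager support could look roughly like the sketch below, where `ManagedReader` and its `closed` flag are hypothetical stand-ins rather than the real DuckDB class.

```python
class ManagedReader:
    """Hypothetical reader; `closed` stands in for releasing a real connection."""

    def __init__(self, connection: str) -> None:
        self.connection = connection
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "ManagedReader":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.close()
        # Returning False lets any exception from the with-body propagate.
        return False
```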

Commits

  1. Rust API refactor - Writer::render(), execute_sql(), unregister(), NoVisualise error
  2. Python bindings restructure - New modules, exceptions, type stubs
  3. Python tests - Full coverage for new API
  4. Documentation - Updated examples and reference
  5. Formatting fix - cargo fmt
  6. REST API fix - Handle NoVisualise error, clippy warnings

🤖 Generated with Claude Code

cpsievert and others added 4 commits January 29, 2026 19:40
API changes:
- Move render() from Prepared to Writer trait for better separation of concerns
- Rename Reader::execute() to execute_sql() for clarity
- Add Reader::unregister() method for table cleanup
- Add GgsqlError::NoVisualise variant for queries without VISUALISE clause

The Writer now has primary responsibility for rendering, with render()
as the main entry point that delegates to write() internally. This makes
the API more intuitive: writer.render(&prepared) instead of
prepared.render(&writer).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

New module structure:
- ggsql.readers: DuckDB reader class with execute() and execute_sql()
- ggsql.writers: VegaLite writer with render_json() and render_chart()
- ggsql.types: Prepared, Validated, and exception classes

Key improvements:
- Proper exception hierarchy: GgsqlError base with ParseError,
  ValidationError, ReaderError, WriterError, NoVisualiseError
- DuckDB.execute() auto-registers/unregisters DataFrames for clean API
- Narwhals integration moved to Python layer for DataFrame conversion
- Type stubs (_ggsql.pyi) for IDE support and type checking
- Context manager support for DuckDB reader
- Removed render_altair() and prepare() - replaced by cleaner two-stage API

Breaking changes:
- ggsql.DuckDBReader -> ggsql.readers.DuckDB
- ggsql.VegaLiteWriter -> ggsql.writers.VegaLite
- ggsql.prepare(query, reader) -> reader.execute(query, data_dict)
- prepared.render(writer) -> writer.render_json(prepared)
- Custom Python readers no longer supported (use DuckDB with registration)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Test coverage for:
- New module paths (ggsql.readers.DuckDB, ggsql.writers.VegaLite)
- execute() with data dict registration and auto-cleanup
- execute_sql() for plain SQL queries
- NoVisualiseError exception handling
- Exception hierarchy (GgsqlError as base)
- Context manager support
- Narwhals DataFrame support (pandas, polars)
- __repr__ methods for debugging
- render_json() and render_chart() methods

Removed tests for:
- render_altair() convenience function (removed from API)
- Custom Python reader support (removed from API)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- CLAUDE.md: Updated Python bindings section with new module structure,
  exception hierarchy, and API examples
- src/doc/API.md: Updated Rust and Python API reference to reflect
  Writer::render() pattern and new Python module paths
- ggsql-python/README.md: Complete rewrite with new API, examples for
  execute() with data dicts, exception handling, and narwhals support

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

cpsievert and others added 3 commits January 29, 2026 19:51
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add Writer trait import for render() method
- Handle NoVisualise variant in error response mapping
- Fix collapsible-str-replace: combine consecutive replace() calls
- Allow vec_init_then_push for feature-flag dependent version handler

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The Reader abstract base class defines the interface that custom readers
must implement:
- execute(query, data) - Execute ggsql query with optional data registration
- execute_sql(sql) - Execute plain SQL and return DataFrame
- register(name, df) - Register a DataFrame as a table
- unregister(name) - Unregister a table

DuckDB now inherits from Reader, providing a complete reference implementation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
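
For readers unfamiliar with the pattern, an abstract base class with the four methods listed in this commit message might be sketched as follows. The method names come from the commit message; the signatures, type hints, and the toy `DictReader` subclass are assumptions made for illustration and do not reproduce the PR's code.

```python
from abc import ABC, abstractmethod
from typing import Any, Mapping, Optional


class Reader(ABC):
    """Sketch of the interface; signatures are assumed, not copied from the PR."""

    @abstractmethod
    def execute(self, query: str, data: Optional[Mapping[str, Any]] = None) -> Any: ...

    @abstractmethod
    def execute_sql(self, sql: str) -> Any: ...

    @abstractmethod
    def register(self, name: str, df: Any) -> None: ...

    @abstractmethod
    def unregister(self, name: str) -> None: ...


class DictReader(Reader):
    """Hypothetical toy reader that keeps tables in a dict."""

    def __init__(self) -> None:
        self.tables = {}

    def register(self, name, df):
        self.tables[name] = df

    def unregister(self, name):
        self.tables.pop(name, None)

    def execute_sql(self, sql):
        # Toy behaviour: treat `sql` as a bare table name, with no SQL parsing.
        return self.tables[sql]

    def execute(self, query, data=None):
        data = data or {}
        for name, df in data.items():
            self.register(name, df)
        try:
            return self.execute_sql(query)
        finally:
            for name in data:
                self.unregister(name)
```

Instantiating `Reader` directly raises `TypeError`, which is how the base class forces a custom reader to provide all four methods.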
@cpsievert cpsievert marked this pull request as draft January 30, 2026 02:17
@cpsievert
Collaborator Author

cpsievert commented Jan 30, 2026

@georgestagg something went way off the rails with the Reader class refactoring/API.

Before I spend time untangling the mess, does the overall proposal in the PR description seem sensible to you?

Additionally, would it be off base to think we should allow a custom Reader to be implemented in Python without Rust (i.e., simple wrapper around sqlalchemy, Ibis, etc)?

@georgestagg
Collaborator

georgestagg commented Feb 2, 2026

> Additionally, would it be off base to think we should allow a custom Reader to be implemented in Python without Rust (i.e., simple wrapper around sqlalchemy, Ibis, etc)?

This should have already been possible in #89, right?

import ggsql
import polars as pl

class CSVReader:
    """Custom reader that loads data from CSV files."""

    def __init__(self, data_dir: str):
        self.data_dir = data_dir

    def execute(self, sql: str) -> pl.DataFrame:
        # Simple implementation: ignore SQL and return fixed data
        # A real implementation would parse SQL to determine which file to load
        return pl.read_csv(f"{self.data_dir}/data.csv")

# Use custom reader with prepare()
reader = CSVReader("/path/to/data")
prepared = ggsql.prepare(
    "SELECT * FROM data VISUALISE x, y DRAW point",
    reader
)
writer = ggsql.VegaLiteWriter()
json_output = prepared.render(writer)

return nw_df.to_polars()


class DuckDB(Reader):

Note that crossing the Rust/Python boundary has a performance cost, so it might not be a good idea to thinly wrap the DuckDB reader like this.

@cpsievert
Collaborator Author

> This should have already been possible

Yea, I missed that on first pass. I'll have to think on it a bit more.

I'm gonna close this PR in favor of a set of more focused PRs.

@cpsievert cpsievert closed this Feb 2, 2026