Skip to content

Add high-level Rust API and rework Python bindings#89

Merged
georgestagg merged 12 commits intomainfrom
rust-api
Feb 3, 2026
Merged

Add high-level Rust API and rework Python bindings#89
georgestagg merged 12 commits intomainfrom
rust-api

Conversation

@georgestagg
Copy link
Collaborator

  • Adds src/api.rs with a two-stage API: prepare() -> render()
  • Reworks Python bindings
  • Updates CLI and REST handlers to use new API
  • Supports custom Python readers via duck-typed execute(sql) -> DataFrame interface (see docs in PR)
  • Removes split_query from public APIs

Rust API:

  let reader = DuckDBReader::from_connection_string("duckdb://memory")?;
  let prepared = ggsql::prepare("SELECT * FROM data VISUALISE x, y DRAW point", &reader)?;
  let json = prepared.render(&VegaLiteWriter::new())?;                                                                                                                                                                                                                        

Python API:

import ggsql
import polars as pl

reader = ggsql.DuckDBReader("duckdb://memory")
df = pl.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})
reader.register("data", df)
prepared = ggsql.prepare("SELECT * FROM data VISUALISE x, y DRAW point", reader)
prepared.render(ggsql.VegaLiteWriter())

@georgestagg georgestagg requested a review from cpsievert January 29, 2026 13:21
@cpsievert
Copy link
Collaborator

cpsievert commented Jan 30, 2026

This is great, thanks for this!

I have some ideas sketched out in #90, mostly driven by how I'd imagine Python users wanting things. With that approach, I think we can get away with a reduced, singular API:

import ggsql
import polars as pl

df = pl.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

reader = ggsql.readers.DuckDB("duckdb://memory")
ggspec = reader.execute(
    "SELECT * FROM data VISUALISE x, y DRAW point", 
    {"data": df}
)

# return an altair chart
writer = ggsql.writers.VegaLite()
writer.render_chart(ggspec)

# or a string/dict for more control
writer.render(ggspec)

That said, feel free to merge here if you like, and we can iterate on #90 as a follow up (it's very much a WIP at this point).

BTW, here are a few notes about the current API that seem suboptimal:

  1. Something like reader.prepare(query) would feel more Pythonic than ggsql.prepare(query, reader), but maybe you have good not reasons to not do that in Rust?

  2. IMO the name .prepare() doesn't invoke a sense of "this thing executes the query". This lead to the name change to .execute() in Refactor API: Writer-centric rendering and Python module restructure #90

  3. It feels important that a writer can output to different types (e.g., str, dict, altair.Chart, etc), thus the change from prepared.render() to writer.render_foo(). It also feels generally more extensible if the writer consumes the spec?

  4. It feels important that we have a .unregister() as well as a register+execute+unregister like pl.SQLContext.

@georgestagg
Copy link
Collaborator Author

georgestagg commented Feb 2, 2026

Since something has gone wrong with the Reader in #90, let me rework this with your suggestions and what's left can go in an updated #90.

mostly driven by how I'd imagine Python users wanting things.

The Rust side of the API is consumed not just by the Python package, but also a collection of other Rust things (and a future R package). I am very happy for you to make the interface exposed by the Python package to users as Pythonic as possible, but we must be sure the nature does not leak too much into the Rust backend.

That said, these do look like good suggestions in general. Let me work through them, and we'll see where we land.

@georgestagg
Copy link
Collaborator Author

It feels important that a writer can output to different types (e.g., str, dict, altair.Chart, etc), thus the change from prepared.render() to writer.render_foo().

This is the only part I'm not sure about also reflecting onto the Rust side. I think for this PR I'm going to keep a generic render() function on the Rust trait that can potentially return anything depending on the writer (including None, with a side effect of e.g. writing to disk).

On the other hand, I am happy for a re-worked #90 to expand this on the Python side so that writers are given extra methods like render_altair().

@cpsievert
Copy link
Collaborator

I think for this PR I'm going to keep a generic render() function on the Rust trait

Oh, totally -- that change specifically is for Python.

BTW, I'm good with merging this now. It's probably better that I break my stuff into multiple PRs so it's easier to review.

@georgestagg
Copy link
Collaborator Author

Your suggestions made enough sense to me that I've already decided to make changes :)

I'll tag you for a re-review when I'm done, then we can merge and work from there.

Copy link
Collaborator

@cpsievert cpsievert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@georgestagg
Copy link
Collaborator Author

Okay! So I've flipped the API in the way you suggest, renaming prepare() to execute() and supporting the workflow with writer.render(spec). I think you are right that it is better this way around.

I think I've caught everything in this PR's scope. I also added a little example at the end of test_ggsql.py of using a custom reader with ibis as backend, though I haven't tested the output deeply (just that it runs and the test passes...)

Please can you take another look so that I'm sure that this is along the direction you were thinking of, before I merge?

@georgestagg georgestagg requested a review from cpsievert February 2, 2026 16:52
@cpsievert
Copy link
Collaborator

cpsievert commented Feb 2, 2026

Awesome, thanks for this! Just a couple minor notes that might impact the Rust API:

  • I was hoping for .execute() to optionally take a dictionary of data frames (inspired by polars SQLContext API). Do you feel like that is worth doing?
  • Exposing different classes of exceptions (e.g., ParseError, NoVisualise, etc)?

A couple other, more Python-specific things I've been considering that I'm happy to do as follow-up if you're on board:

  1. Adding type stubs for the Rust bindings (to help with the typing/autocomplete/etc experience)
  2. Related to 1, it would help the dev experience if the Reader class was an ABC/Protocol so it's more obvious what the contract is.
    • I'm OK with punting on this in the short-term, but long-term it would be nice.
  3. Moving the logic for render_altair into a wrapper VegaLiteWriter class with .render_chart() and .render() methods

Also, much longer term, I think we may also want the Reader implementations in separate Python packages (that you install via extras like pip install ggsql[duckdb]). This way, if you wanted to implement your custom Reader that wraps sqlalchemy, etc., you don't such a heavy ggsql wheel.

@georgestagg
Copy link
Collaborator Author

georgestagg commented Feb 3, 2026

A couple other, more Python-specific things I've been considering that I'm happy to do as follow-up if you're on board [...]

All of these sound like good ideas. Please feel free to take them on in follow-up PRs!

I was hoping for .execute() to optionally take a dictionary of data frames (inspired by polars SQLContext API). Do you feel like that is worth doing?

I am happy for this to be the case on the Python side, but I'd like to keep the Rust API Reader.execute() as taking an immutable &self. We'd need to have execute(&mut self, name, df) if we wanted to allow optional data registration because the .register() method requires a mutable reference.

I think for the Python side you could do it in the bridge function:

#[pyfunction]
fn execute(query: &str, reader: &Bound<'_, PyAny>) -> PyResult<PySpec> {
    // Register data if provided
      if let Some(data_dict) = data {
        for (name, df) in data_dict.iter() {
            let name_str: String = name.extract()?;
            reader.call_method1("register", (name_str.as_str(), df))?;
        }
    }  

    [... existing function...]
}

Also, much longer term, I think we may also want the Reader implementations in separate Python packages (that you install via extras like pip install ggsql[duckdb]).

Sounds good. When we think it is time for distributing on PyPI can I get back to you and for assistance deploying/maintaining this?

@georgestagg georgestagg merged commit bbbb2a0 into main Feb 3, 2026
29 checks passed
@cpsievert
Copy link
Collaborator

When we think it is time for distributing on PyPI can I get back to you and for assistance deploying/maintaining this?

Yep!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants