Skip to content

SQL expressions in VISUALISE mappings silently consumed as SQL instead of reporting parse error #256

@cpsievert

Description

@cpsievert

Summary

When a ggsql query contains a SQL expression (like CAST(...)) inside a VISUALISE mapping, the parser does not report an error. Instead, it silently treats the entire query — including the VISUALISE clause — as plain SQL. The validate() result reports valid() = True, errors() = [], but has_visual() = False and visual() = ''.

This makes it very difficult for callers to diagnose what went wrong: the query appears valid but has no visual component, even though the user clearly intended to create a visualization.

Reproducible example

import ggsql

query = """SELECT sex, survived, COUNT(*) AS n FROM titanic GROUP BY sex, survived
VISUALISE sex AS x, n AS y, CAST(survived AS VARCHAR) AS fill
DRAW bar"""

result = ggsql.validate(query)
print("valid:", result.valid())        # True
print("errors:", result.errors())      # []
print("warnings:", result.warnings())  # []
print("has_visual:", result.has_visual())  # False
print("visual:", repr(result.visual()))    # ''
print("sql:", repr(result.sql()))          # entire query including VISUALISE

Expected behavior

ggsql.validate() should report an error or warning when it encounters a non-column-name expression (like CAST(...), function calls, or other SQL expressions) in a VISUALISE or MAPPING position. Something like:

Error: unexpected expression in VISUALISE mapping at 'CAST(survived AS VARCHAR)'.
VISUALISE mappings accept column names only. Move data transformations to the SELECT clause.

At minimum, if valid() returns True but the raw query text contains the VISUALISE keyword and has_visual() returns False, a warning should be surfaced via warnings().

Context

This came up when integrating ggsql with an LLM-powered tool. The LLM frequently generates queries like this — putting CAST() inside VISUALISE instead of in the SELECT clause. The silent failure makes it hard to provide useful error feedback to guide the LLM toward a corrected query.

The correct version of the query above would be:

SELECT sex, CAST(survived AS VARCHAR) AS survived, COUNT(*) AS n FROM titanic GROUP BY sex, survived
VISUALISE sex AS x, n AS y, survived AS fill
DRAW bar

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions