In duckdb==1.5.3, DuckDBPyRelation.to_arrow_table() raises InternalException for GEOMETRY('EPSG:xxx') columns #475

@JonAnCla

What happens?

I was trying to get ibis working with duckdb 1.5.3 and hit a crash in duckdb that is triggered by some of the ibis geospatial unit tests.

I had Copilot put together a minimal reproducer; see below.

To Reproduce

"""
Minimal reproducer for DuckDB 1.5.3 regression:
  DuckDBPyRelation.to_arrow_table() raises InternalException when a query
  result contains a GEOMETRY column with an associated EPSG CRS
  (i.e. type reported as GEOMETRY('EPSG:xxxx')).

Root cause (from stack trace):
  CoordinateReferenceSystem::TryConvert() calls
  Transaction::Get(ClientContext, AttachedDatabase) while building the Arrow
  schema, but there is no active transaction at that point.
  ArrowGeometry::WriteCRS -> CoordinateReferenceSystem::TryConvert ->
  Transaction::Get -> "TransactionContext::ActiveTransaction called without
  active transaction" -> INTERNAL assertion.

The workaround (rel.to_arrow_reader().read_all()) succeeds, presumably because
the streaming Arrow reader path builds the schema differently and does not
trigger the catalog lookup outside an active transaction.

Tested on: duckdb==1.5.3, Python 3.13.13
"""

import json
import os
import tempfile

import duckdb

print(f"duckdb version: {duckdb.__version__}")

con = duckdb.connect()
con.install_extension("spatial")  # no-op if the extension is already installed
con.load_extension("spatial")

# Minimal GeoJSON with an explicit EPSG:2263 CRS
# (any EPSG code reproduces the issue; 2263 = NY State Plane Long Island)
geojson = {
    "type": "FeatureCollection",
    "crs": {"type": "name", "properties": {"name": "urn:ogc:def:crs:EPSG::2263"}},
    "features": [
        {
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "Point", "coordinates": [935996.0, 191376.0]},
        },
        {
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "Point", "coordinates": [935000.0, 190000.0]},
        },
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test_epsg.geojson")
    with open(path, "w") as f:
        json.dump(geojson, f)

    rel = con.sql(f"SELECT * FROM ST_READ('{path}')")

    # DuckDB correctly reports the GEOMETRY type with the embedded CRS:
    print(f"Column types: {rel.dtypes}")
    # Expected: [BIGINT, GEOMETRY('EPSG:2263')]

    # Workaround: to_arrow_reader() works fine
    print("\n--- to_arrow_reader().read_all() (workaround) ---")
    tbl = rel.to_arrow_reader().read_all()
    print(f"OK — schema:\n{tbl.schema}\n")

    # Bug: to_arrow_table() raises InternalException
    # Stack:
    #   DuckDBPyResult::FetchArrowTable
    #   ArrowConverter::ToArrowSchema
    #   SetArrowFormat -> SetArrowExtension -> ArrowGeometry::PopulateSchema
    #   ArrowGeometry::WriteCRS
    #   CoordinateReferenceSystem::TryConvert   <-- needs active transaction
    #   Transaction::Get(ClientContext, AttachedDatabase)
    #   "TransactionContext::ActiveTransaction called without active transaction"
    print("--- rel.to_arrow_table() (should raise InternalException) ---")
    try:
        tbl2 = rel.to_arrow_table()
        print(f"Unexpectedly OK: {tbl2.schema}")
    except duckdb.InternalException as e:
        print(f"REGRESSION CONFIRMED — duckdb.InternalException: {e}")
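
Until this is fixed, calling code could route affected relations through the reader path up front instead of triggering the exception. The following is a minimal sketch, not an official API: the helper names `has_crs_geometry` and `to_arrow_avoiding_crs_bug` are hypothetical, and it assumes that `str()` of each entry in `rel.dtypes` has the form `GEOMETRY('EPSG:xxxx')` shown in the reproducer output above.

```python
from typing import Any, Iterable


def has_crs_geometry(dtypes: Iterable[Any]) -> bool:
    """Return True if any column type looks like GEOMETRY('<crs>').

    Assumption: str(dtype) renders a CRS-tagged geometry column as
    "GEOMETRY('EPSG:xxxx')", as printed by the reproducer.
    """
    return any(str(t).strip().upper().startswith("GEOMETRY(") for t in dtypes)


def to_arrow_avoiding_crs_bug(rel: Any) -> Any:
    """Materialise a relation as an Arrow table (hypothetical helper).

    Uses the reader path reported to work in this issue when a CRS-tagged
    GEOMETRY column is present, and to_arrow_table() otherwise.
    """
    if has_crs_geometry(rel.dtypes):
        return rel.to_arrow_reader().read_all()
    return rel.to_arrow_table()
```

With the reproducer's relation, `tbl = to_arrow_avoiding_crs_bug(rel)` would take the `to_arrow_reader()` branch. Checking the types first, rather than catching `duckdb.InternalException` and retrying, avoids relying on the connection still being usable after an internal error, which the reproducer does not test.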

OS:

Linux (Ubuntu 24.04 x86_64)

DuckDB Package Version:

1.5.3

Python Version:

3.13.13

Full Name:

Jonathan Clarke

Affiliation:

FiveSigma Finance

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration to reproduce the issue?

  • Yes, I have
