I'm working on a setup where we use a Python CatalogProvider with register_catalog_provider:
class MyCatalog:
    ...

ctx.register_catalog_provider('datafusion', MyCatalog())
ctx.sql(...)
This results in a call stack that goes Python -> Rust -> Python and back. As a result, if MyCatalog raises an error, it gets badly mangled before being re-raised (for example by ctx.sql):
DataFusion error: Execution("PyErr { type: <class 'internal.CatalogClientError'>, value: CatalogClientError('Table \".nonexistant_table\" not found...')"
There's no way to recover anything useful from this exception without string-parsing.
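To make the string-parsing problem concrete, here is a minimal sketch of what a caller has to do today. The helper name `parse_mangled_pyerr` and the regular expression are hypothetical, written only against the single message quoted above; they are not part of datafusion-python and would break if the formatting changed.

```python
import re

# The mangled message quoted above, copied verbatim (note it is truncated).
MESSAGE = r'''DataFusion error: Execution("PyErr { type: <class 'internal.CatalogClientError'>, value: CatalogClientError('Table \".nonexistant_table\" not found...')"'''

# Hypothetical pattern: pulls the dotted class name and the raw repr text
# out of the Debug-formatted PyErr.
PYERR_PATTERN = re.compile(
    r"PyErr \{ type: <class '(?P<type>[\w.]+)'>, value: (?P<value>.+)$"
)


def parse_mangled_pyerr(message):
    """Return (type_name, value_repr) or None if the message does not match."""
    match = PYERR_PATTERN.search(message)
    if match is None:
        return None
    return match.group("type"), match.group("value")


parsed = parse_mangled_pyerr(MESSAGE)
```

Even when this matches, the caller only gets strings back: the original exception object, its attributes, and its traceback are gone.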
To fix this, we'd probably need to add DataFusionError::Ffi(Box<dyn Error>) upstream, then construct it here (datafusion-python/src/errors.rs, line 96 in f0bbad7), where the code currently builds:
InnerDataFusionError::Execution(format!("{e:?}"))
Then, we could check for it here, and, if it matches, potentially return the original PyErr unchanged:
https://github.com/apache/datafusion-python/blob/f0bbad7543717c5f08ba2acb92d42c9d30fd2355/src/errors.rs
I haven't tested this approach, but if it sounds reasonable I could give it a shot.
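As a rough illustration of the intended control flow, here is a pure-Python model of the round trip. It is not the Rust implementation: the classes `Execution` and `Ffi` and the two conversion functions are hypothetical stand-ins for the enum variants and the error conversions discussed above, and the message format is only an approximation.

```python
from dataclasses import dataclass


class CatalogClientError(Exception):
    """Stand-in for the exception raised by the user's catalog."""


@dataclass
class Execution:
    # Models today's behaviour: only a formatted string survives.
    message: str


@dataclass
class Ffi:
    # Models the proposed variant: the original exception object is kept.
    original: BaseException


def to_inner_error(exc, preserve):
    """Model of the Python -> Rust conversion at the catalog boundary."""
    if preserve:
        return Ffi(exc)
    return Execution(f"PyErr {{ type: {type(exc)!r}, value: {exc!r} }}")


def to_python_exception(err):
    """Model of the Rust -> Python conversion before re-raising."""
    if isinstance(err, Ffi):
        # Hand back the original exception unchanged.
        return err.original
    return Exception(f"DataFusion error: Execution({err.message!r})")


original = CatalogClientError("table not found")
mangled = to_python_exception(to_inner_error(original, preserve=False))
recovered = to_python_exception(to_inner_error(original, preserve=True))
```

In this model `recovered` is the very same object as `original`, so a caller could catch `CatalogClientError` directly, while `mangled` is a generic exception that only carries the formatted text.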