feat: per-session toggle for Python UDF inline encoding
Adds `SessionContext.with_python_udf_inlining(enabled)` for two
related use cases:
* **Cross-language portability.** With inlining disabled, the codec
no longer emits `DFPYUDF1` / `DFPYUDA1` / `DFPYUDW1` cloudpickle
blobs. Python UDFs travel by name only, the same way FFI-capsule
UDFs do. Bytes round-trip through a non-Python decoder.
* **Untrusted-source decode.** `Expr.from_bytes` on bytes from a
misbehaving sender no longer invokes `cloudpickle.loads`. Inline
payloads received by a strict decoder raise a clear error.
`PythonLogicalCodec` and `PythonPhysicalCodec` gain a
`python_udf_inlining: bool` field (default `true`) and a builder
method `with_python_udf_inlining(enabled)`. The six UDF
encode/decode dispatchers consult the flag before calling the inline
helpers. Strict decoders that see a magic-prefix payload return a
clear `Plan` error rather than silently falling through to the inner
codec (which would otherwise produce "LogicalExtensionCodec is not
provided", which is accurate but unhelpful).
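
As a rough illustration of the dispatch described above, the following is a minimal pure-Python sketch, not the actual Rust codec. `ToyUdfCodec`, its `encode`/`decode` methods, and the `registry` dict are hypothetical names invented for this example; stdlib `pickle` stands in for `cloudpickle`, and `ValueError` stands in for the `Plan` error. Only the `DFPYUDF1` prefix and the flag name come from this commit.

```python
import pickle

MAGIC_UDF = b"DFPYUDF1"  # scalar-UDF magic prefix named in this commit


class ToyUdfCodec:
    """Hypothetical stand-in for one encode/decode dispatcher pair."""

    def __init__(self, python_udf_inlining=True):
        # Mirrors the `python_udf_inlining: bool` field (default true).
        self.python_udf_inlining = python_udf_inlining

    def with_python_udf_inlining(self, enabled):
        # Builder-style: return a new codec carrying the new setting.
        return ToyUdfCodec(python_udf_inlining=enabled)

    def encode(self, name, func):
        if self.python_udf_inlining:
            # Inline mode: magic prefix followed by a pickled payload.
            return MAGIC_UDF + pickle.dumps((name, func))
        # Strict mode: the UDF travels by name only.
        return name.encode("utf-8")

    def decode(self, payload, registry):
        if payload.startswith(MAGIC_UDF):
            if not self.python_udf_inlining:
                # Fail loudly instead of falling through to an inner codec.
                raise ValueError(
                    "inline Python UDF payload received, "
                    "but inlining is disabled on this codec"
                )
            _name, func = pickle.loads(payload[len(MAGIC_UDF):])
            return func
        # Name-only payload: resolve against the receiver's registry.
        return registry[payload.decode("utf-8")]


codec = ToyUdfCodec()
strict = codec.with_python_udf_inlining(False)

inline_bytes = codec.encode("my_abs", abs)
name_bytes = strict.encode("my_abs", abs)
```

In this sketch the strict encoder output is just the UDF name, so it is smaller than the inline payload, and a strict decoder never reaches `pickle.loads` when handed `inline_bytes`.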
`PySessionContext::with_python_udf_inlining(enabled)` rebuilds both
codecs with the new setting; the Python wrapper
`SessionContext.with_python_udf_inlining` mirrors it. Test coverage:
encoder size delta, strict round-trip via registry, and a clear error
on an inline payload when strict.
`pickle.loads` on untrusted bytes remains unsafe regardless of this
flag; the toggle only governs the `to_bytes` / `from_bytes` codec
path. User guide documents both use cases plus the limitation.
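
To make the limitation concrete, here is a small self-contained sketch of why `pickle.loads` on untrusted bytes is unsafe on its own: unpickling calls whatever callable the sender placed in the payload. The `Payload` class is hypothetical and uses a harmless `eval` of an arithmetic string; a hostile sender could substitute any importable callable.

```python
import pickle


class Payload:
    """Hypothetical sender-controlled object."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call eval('6 * 7')".
        # The callable and its arguments are chosen by the sender.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# The receiver only calls pickle.loads, yet the sender's expression runs.
result = pickle.loads(blob)
```

Here `result` is the value of the evaluated expression rather than a `Payload` instance, which shows that the receiver executed code chosen by the sender.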
1097 root tests pass (up from 1094; the 3 new tests are the
strict-mode cases).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>