Commit 9ef5d0a

timsaucer and claude committed
refactor(codec): factor shared schema/signature helpers out of UDF encoders
Scalar, aggregate, and window UDF encode/decode bodies in codec.rs each contained the same five blocks:

* a `match TypeSignature::Exact { ... } else err` to extract input dtypes,
* an `arg_{i}` input-field synthesis with a verbatim 6-line comment,
* a `Schema::new(vec![Field::new(...)])` for the return type,
* a `schema_from_ipc_bytes(...).first().ok_or_else(...)` decode of the single-field return blob, and
* a `parse_volatility(&volatility_str).map_err(...)` round-trip.

Six near-identical bodies meant the same comment text lived in three places, and each `map_err(|e| PyValueError::new_err(format!("{e}")))` chain appeared a handful of times per body.

Extract:

* `signature_input_dtypes(sig, kind)`: `Signature::Exact` extraction with a flavor-tagged error.
* `build_input_schema_bytes(&[DataType])`: synth `arg_{i}` fields and write IPC. Carries the comment explaining why field metadata is discarded on decode.
* `build_single_field_schema_bytes(&Field)` / `build_schema_bytes(Vec<Field>)`: IPC writer wrappers.
* `read_input_dtypes(&[u8])` / `read_single_return_field(&[u8], kind)`: decode side.
* `arrow_to_py_err(ArrowError) -> PyErr` and `parse_volatility_str` to collapse the repeated `map_err` chains.

Wire format unchanged. Six encode/decode bodies collectively shrink from ~300 to ~140 LOC and stop carrying triplicated comment text that would otherwise drift independently.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
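
For readers without the diff open, the shape of two of the extracted helpers might look roughly like the sketch below. It is not the code from codec.rs: `DataType`, `TypeSignature`, `Field`, and `CodecError` are simplified stand-ins defined here so the snippet compiles without the arrow, datafusion, or pyo3 crates, and `synth_input_fields` is a hypothetical name covering only the `arg_{i}` synthesis half of `build_input_schema_bytes` (the IPC write is omitted). Only the `signature_input_dtypes(sig, kind)` name and the `arg_{i}` naming come from the commit message; the error wording is assumed.

```rust
/// Stand-in for Arrow's `DataType` (assumption: trimmed to three variants).
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int64,
    Float64,
    Utf8,
}

/// Stand-in for the type signature enum (assumption: only two variants shown).
#[derive(Debug, Clone)]
pub enum TypeSignature {
    Exact(Vec<DataType>),
    Variadic(Vec<DataType>),
}

/// Stand-in for an Arrow `Field`: a name and a dtype, with no metadata.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub dtype: DataType,
}

/// Stand-in error type; the commit message indicates the real helpers produce a `PyErr`.
#[derive(Debug, PartialEq)]
pub struct CodecError(pub String);

/// Extract the input dtypes from an exact signature, or fail with an error
/// tagged by UDF flavor (`kind` is e.g. "scalar", "aggregate", "window").
pub fn signature_input_dtypes(
    sig: &TypeSignature,
    kind: &str,
) -> Result<Vec<DataType>, CodecError> {
    match sig {
        TypeSignature::Exact(dtypes) => Ok(dtypes.clone()),
        other => Err(CodecError(format!(
            "{kind} UDF requires an Exact signature, got {other:?}"
        ))),
    }
}

/// Synthesize positional `arg_{i}` fields for the given dtypes.
/// Hypothetical helper: the real one also serializes the schema to IPC bytes.
pub fn synth_input_fields(dtypes: &[DataType]) -> Vec<Field> {
    dtypes
        .iter()
        .enumerate()
        .map(|(i, dt)| Field {
            name: format!("arg_{i}"),
            dtype: dt.clone(),
        })
        .collect()
}
```

Passing `kind` as a parameter is what lets one helper serve all three UDF flavors while keeping each error message specific to its caller.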
1 parent e39311c commit 9ef5d0a

1 file changed

Lines changed: 116 additions & 185 deletions

0 commit comments