[Bug] bit_length and octet_length error natively for BinaryType input instead of falling back #4464

@andygrove

Describe the bug

bit_length and octet_length are wired as plain CometScalarFunction("bit_length") / CometScalarFunction("octet_length") in QueryPlanSerde.scala with no BinaryType guard, so both report Compatible(None) for BinaryType input. However, DataFusion's BitLengthFunc and OctetLengthFunc declare a Signature::coercible(... logical_string() ...) signature and reject Binary at execution time. The net effect: bit_length(<binary>) and octet_length(<binary>) plan successfully under Comet, then fail with a native execution error instead of falling back cleanly to Spark.

For contrast, length (also handled by Comet) explicitly guards BinaryType in CometLength.getSupportLevel and falls back to Spark.

Surfaced by the string-expressions audit in #4461.

Steps to reproduce

CREATE TABLE t(b binary) USING parquet;
INSERT INTO t VALUES (X'48656c6c6f');
SELECT bit_length(b) FROM t;
SELECT octet_length(b) FROM t;

Spark: returns 40 and 5.
Comet: native execution error from DataFusion's BitLengthFunc / OctetLengthFunc signature check.
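
The Spark results can be checked by hand: X'48656c6c6f' is the five ASCII bytes of "Hello", so octet_length is 5 and bit_length is 5 * 8 = 40. A minimal standalone Rust sketch of that arithmetic follows; decode_hex is a hypothetical helper written for this illustration, not part of Comet, DataFusion, or Arrow.

```rust
/// Hypothetical helper: decode a hex string such as "48656c6c6f" into bytes.
/// Returns None if the length is odd or any character is not a hex digit.
fn decode_hex(hex: &str) -> Option<Vec<u8>> {
    if hex.len() % 2 != 0 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Input is pure ASCII at this point, so slicing by byte index is safe.
    (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
        .collect()
}

fn main() {
    // The literal inserted by the repro: X'48656c6c6f'.
    let bytes = decode_hex("48656c6c6f").expect("valid hex literal");
    let octets = bytes.len(); // number of bytes
    let bits = octets * 8; // eight bits per byte
    println!("octet_length = {octets}, bit_length = {bits}");
}
```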

Expected behavior

Either guard BinaryType in dedicated CometBitLength / CometOctetLength serdes (mirroring CometLength), or wire to a Comet-side UDF that supports both Utf8 and Binary (the underlying arrow::compute::bit_length / length kernels do support Binary natively).
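
To make the second option concrete, here is a rough sketch of the semantics a function accepting both Utf8 and Binary would need. It is only an illustration under stated assumptions: the Value enum and both functions are hypothetical stand-ins using the Rust standard library, not Comet's UDF interface or the Arrow kernels, and a real implementation would operate on Arrow arrays rather than single values.

```rust
/// Hypothetical stand-in for a single column value (not an Arrow or Comet type).
enum Value {
    Utf8(String),
    Binary(Vec<u8>),
    Null,
}

/// Byte length of a string or binary value; None for SQL NULL.
/// In this sketch a length that does not fit in i32 also maps to None.
fn octet_length(v: &Value) -> Option<i32> {
    match v {
        // String::len counts UTF-8 bytes, not characters.
        Value::Utf8(s) => i32::try_from(s.len()).ok(),
        Value::Binary(b) => i32::try_from(b.len()).ok(),
        Value::Null => None,
    }
}

/// Bit length is the byte length times eight, with overflow mapped to None.
fn bit_length(v: &Value) -> Option<i32> {
    octet_length(v)?.checked_mul(8)
}

fn main() {
    // Same bytes as X'48656c6c6f' from the steps to reproduce.
    let hello = Value::Binary(vec![0x48, 0x65, 0x6c, 0x6c, 0x6f]);
    println!("{:?} {:?}", bit_length(&hello), octet_length(&hello));

    let text = Value::Utf8("Hello".to_string());
    println!("{:?} {:?}", bit_length(&text), octet_length(&text));
}
```

Both input types go through the same byte-count path, which matches the Spark behavior described in this issue, where StringType and BinaryType inputs both produce an IntegerType result.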

Additional context

  • Wiring: QueryPlanSerde.scala lines 176 (bit_length) and 187 (octet_length).
  • Existing guard pattern: CometLength in spark/src/main/scala/org/apache/comet/serde/strings.scala.
  • Spark accepts (StringType|BinaryType) -> IntegerType for both expressions across 3.4.3, 3.5.8, and 4.0.1.
