Describe the bug
Spark's spark.sql.legacy.castComplexTypesToString.enabled configuration switches the format used by CAST(<struct|array|map> AS STRING):
- Default (false): structs render as {a, b, c}, arrays as [1, 2, 3], and null elements as null.
- Legacy (true): structs render as [a, b, c], arrays as [1, 2, 3], and null elements as "" (empty string).
Comet's native struct/array-to-string formatter hard-codes the default-mode brackets and uses cast_options.null_string = "null". The legacy_cast_complex_to_string flag is not currently plumbed through the cast proto, so users who set spark.sql.legacy.castComplexTypesToString.enabled=true will see Comet output differ from Spark for CAST(<complex> AS STRING).
Surfaced by the cast audit (collection PR queue).
Steps to reproduce
SET spark.sql.legacy.castComplexTypesToString.enabled = true;
SELECT CAST(struct(1, 2, null) AS STRING);
-- Spark legacy: [1, 2,]
-- Comet: {1, 2, null}
Expected behavior
Either:
- Plumb the conf through the Cast proto and switch the native formatter on legacy_cast_complex_to_string, OR
- Downgrade (StructType|ArrayType|MapType, StringType) casts to Incompatible(Some(...)) when the conf is enabled.
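For the first option, a minimal Rust sketch of a formatter that switches on the flag might look like the following. It is not Comet's actual code: `ComplexToStringOptions` and `format_struct` are hypothetical names, the fields are assumed to be already rendered as strings, and the exact spacing around null fields in legacy mode is an assumption that should be checked against Spark's Cast implementation.

```rust
/// Hypothetical options type. The field name mirrors the flag named in this
/// issue; the real Comet cast options may be shaped differently.
#[derive(Clone, Copy)]
struct ComplexToStringOptions {
    legacy_cast_complex_to_string: bool,
}

/// Formats struct fields that were already converted to strings.
/// `None` stands for a SQL NULL field.
///
/// Assumed rules (to be verified against Spark):
/// - default mode: `{` `}` brackets, NULL rendered as `null`
/// - legacy mode:  `[` `]` brackets, NULL rendered as nothing at all
/// - fields are separated by `,`, and a space precedes each rendered value
///   after the first
fn format_struct(fields: &[Option<&str>], opts: ComplexToStringOptions) -> String {
    let legacy = opts.legacy_cast_complex_to_string;
    let (open, close) = if legacy { ('[', ']') } else { ('{', '}') };

    let mut out = String::new();
    out.push(open);
    for (i, field) in fields.iter().enumerate() {
        if i > 0 {
            out.push(',');
        }
        // In legacy mode a NULL field contributes no text; otherwise it is
        // rendered as the literal `null`.
        let rendered: Option<&str> = match field {
            Some(v) => Some(*v),
            None if legacy => None,
            None => Some("null"),
        };
        if let Some(text) = rendered {
            if i > 0 {
                out.push(' ');
            }
            out.push_str(text);
        }
    }
    out.push(close);
    out
}
```

Calling `format_struct(&[Some("1"), Some("2"), None], opts)` with the flag unset and then set shows the two renderings side by side. In a real fix the flag would have to be read from the Spark conf on the JVM side and carried through the Cast proto to the native formatter.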
Additional context
- Native impl: native/spark-expr/src/conversion_funcs/cast.rs (struct/array string formatters)
- Comet matrix: CometCast.canCastToString
- Spark conf: SQLConf.LEGACY_COMPLEX_TYPES_TO_STRING