Describe the bug
CometCaseConversionBase (parent of CometUpper and CometLower) gates compatibility inside convert(...):
override def convert(expr: T, inputs: Seq[Attribute], binding: Boolean): Option[Expr] = {
if (!CometConf.COMET_CASE_CONVERSION_ENABLED.get()) {
withInfo(expr, "Comet is not compatible with Spark for case conversion in " +
s"locale-specific cases. Set ${CometConf.COMET_CASE_CONVERSION_ENABLED.key}=true " +
"to enable it anyway.")
return None
}
super.convert(expr, inputs, binding)
}
It never overrides getSupportLevel, so the class-level support level defaults to Compatible(None) even though getIncompatibleReasons() is populated. The Step 5 consistency rule in the audit-comet-expression skill calls this out specifically as the canonical antipattern to avoid. Surfaced by the string-expressions audit in #4461.
Concrete consequences:
- The dispatcher's
allowIncompatible machinery is bypassed. A user who flips the standard spark.comet.expr.allowIncompatible=true (or the per-expression spark.comet.expression.Upper.allowIncompatible) still has lower/upper fall back, because the gate is the older bespoke spark.comet.caseConversion.enabled.
- EXPLAIN and the auto-generated compatibility docs disagree. EXPLAIN shows the bespoke message above; the compat doc lists the populated
getIncompatibleReasons() text instead.
initcap uses the standard Incompatible(Some(...)) flow; lower/upper use the bespoke flow. The asymmetry is confusing for users who have just opted into all string incompatibilities.
Expected behavior
Override getSupportLevel to return Incompatible(Some(reason)) from a shared private val, drop the in-convert conf check, and either deprecate spark.comet.caseConversion.enabled in favor of the standard per-expression allowIncompatible flag or have the support level read the standard flag.
Additional context
Describe the bug
CometCaseConversionBase(parent ofCometUpperandCometLower) gates compatibility insideconvert(...):It never overrides
getSupportLevel, so the class-level support level defaults toCompatible(None)even thoughgetIncompatibleReasons()is populated. The Step 5 consistency rule in theaudit-comet-expressionskill calls this out specifically as the canonical antipattern to avoid. Surfaced by the string-expressions audit in #4461.Concrete consequences:
allowIncompatiblemachinery is bypassed. A user who flips the standardspark.comet.expr.allowIncompatible=true(or the per-expressionspark.comet.expression.Upper.allowIncompatible) still haslower/upperfall back, because the gate is the older bespokespark.comet.caseConversion.enabled.getIncompatibleReasons()text instead.initcapuses the standardIncompatible(Some(...))flow;lower/upperuse the bespoke flow. The asymmetry is confusing for users who have just opted into all string incompatibilities.Expected behavior
Override
getSupportLevelto returnIncompatible(Some(reason))from a sharedprivate val, drop the in-convertconf check, and either deprecatespark.comet.caseConversion.enabledin favor of the standard per-expressionallowIncompatibleflag or have the support level read the standard flag.Additional context
CometCaseConversionBaseinspark/src/main/scala/org/apache/comet/serde/strings.scalaCometConf.COMET_CASE_CONVERSION_ENABLED(defaultfalse)