Skip to content

Conversation

@cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

This is a followup to #53788 which moved the namespace length check from individual command handlers to the CatalogAndIdentifier extractor. That approach is too aggressive: users can extend the session catalog via CatalogExtension and support multi-part namespaces through v2 APIs. The check should only happen when we actually create v1 identifiers like TableIdentifier, not at the shared name-resolution layer.

Changes

  • Remove the namespace check from CatalogAndIdentifier.unapply so it remains a pure name-resolution mechanism, preserving CatalogExtension flexibility.
  • Tighten CatalogV2Implicits.IdentifierHelper.asTableIdentifier and asFunctionIdentifier to require exactly one namespace part and throw REQUIRES_SINGLE_PART_NAMESPACE (instead of the less precise IDENTIFIER_TOO_MANY_NAME_PARTS). This centralizes the validation.
  • Remove the now-redundant V2SessionCatalog.TableIdentifierHelper and use the unified CatalogV2Implicits conversion everywhere.
  • Simplify ResolveSessionCatalog extractors (ResolvedV1Identifier, ResolvedIdentifierInSessionCatalog, ResolvedViewIdentifier) to delegate to ident.asTableIdentifier.
  • Fix SparkSqlParser temp view creation to check name length before calling asTableIdentifier, so the user always sees notAllowedToAddDBPrefixForTempViewError instead of a generic error.

Why are the changes needed?

CatalogExtension allows users to extend the built-in session catalog and potentially support multi-part namespaces for v2 operations. The early check in CatalogAndIdentifier would block such extensions. The namespace length should only be validated when we actually need to create v1 identifiers (e.g. TableIdentifier), which inherently require a single-part namespace.

Additionally, this PR unifies the scattered namespace validation into a single point (CatalogV2Implicits.IdentifierHelper), reducing code duplication and ensuring consistent REQUIRES_SINGLE_PART_NAMESPACE errors.

Does this PR introduce any user-facing change?

Yes. Multi-part namespace identifiers now flow through CatalogAndIdentifier without error, allowing CatalogExtension implementations to handle them. The error is only thrown when the session catalog actually needs to convert to a v1 identifier.

How was this patch tested?

Existing tests updated:

  • LookupCatalogSuite — removed the early-rejection test, added multi-part namespace cases back.
  • V2SessionCatalogSuite — updated to expect REQUIRES_SINGLE_PART_NAMESPACE.
  • DataSourceV2SQLSuiteV1Filter and DataSourceV2FunctionSuite — all 313 tests pass.

Was this patch authored or co-authored using generative AI tooling?

Yes.

Made with Cursor

@cloud-fan
Copy link
Contributor Author

@dongjoon-hyun
Copy link
Member

Could you double-check the following two failures, @cloud-fan ?

Screenshot 2026-02-10 at 12 18 51 PM

Copy link
Member

@szehon-ho szehon-ho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some comments, mostly lgtm.

i can look at adding some test to mimic CatalogExtension and multi part ns like Iceberg for this behavior if it helps

def asMultipartIdentifier: Seq[String] = (ident.namespace :+ ident.name).toImmutableArraySeq

def asTableIdentifier: TableIdentifier = ident.namespace match {
case ns if ns.isEmpty => TableIdentifier(ident.name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious here, is it necessary to remove these if ns is not empty? I see that asTableIdentifierOpt still handles the empty ns case.

I assume in V1SessionCatalog case, the namespace is never empty, but just checking if its necessary to remove, in case some v2 connector is using the method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

withIdentClause(ctx.identifierReference(), Seq(qPlan), (ident, otherPlans) => {
val tableIdentifier = ident.asTableIdentifier
if (tableIdentifier.database.isDefined) {
if (ident.length > 1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this case covered by some test? (its easy to understand though, so not blocking)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exception = intercept[AnalysisException](testIdent.asTableIdentifier),
condition = "IDENTIFIER_TOO_MANY_NAME_PARTS",
parameters = Map("identifier" -> "`a`.`b`.`c`", "limit" -> "2")
condition = "REQUIRES_SINGLE_PART_NAMESPACE",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just opinion that previously having the identifier was a bit easier to debug. I wonder if we can extend the error message to have a version with identifier?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, working on it.

…ifier creation

This is a followup to apache#53788 which moved the namespace length check from
individual command handlers to the `CatalogAndIdentifier` extractor.

That approach is too aggressive: users can extend the session catalog via
`CatalogExtension` and support multi-part namespaces through v2 APIs. The
check should only happen when we actually create v1 identifiers like
`TableIdentifier`, not at the shared name-resolution layer.

Changes:
- Remove the namespace check from `CatalogAndIdentifier.unapply` so it
  remains a pure name-resolution mechanism.
- Tighten `CatalogV2Implicits.IdentifierHelper.asTableIdentifier` and
  `asFunctionIdentifier` to require exactly one namespace part and throw
  `REQUIRES_SINGLE_PART_NAMESPACE` (instead of the less precise
  `IDENTIFIER_TOO_MANY_NAME_PARTS`).
- Remove the now-redundant `V2SessionCatalog.TableIdentifierHelper` and
  use the unified `CatalogV2Implicits` conversion everywhere.
- Simplify `ResolveSessionCatalog` extractors (`ResolvedV1Identifier`,
  `ResolvedIdentifierInSessionCatalog`, `ResolvedViewIdentifier`) to
  delegate to `ident.asTableIdentifier`.
- Fix `SparkSqlParser` temp view creation to check name length before
  calling `asTableIdentifier`, so the user always sees
  `notAllowedToAddDBPrefixForTempViewError` instead of a generic error.

Co-authored-by: Cursor <cursoragent@cursor.com>
Remove redundant namespace parameter since identifier already includes
the namespace. The message is now:
"<sessionCatalog> requires a single-part namespace, but got identifier <identifier>."

Co-authored-by: Cursor <cursoragent@cursor.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants