[SPARK-56697][SQL][DML] Refactor Catalog Manager#55646
Conversation
a7a29ac to
b4de792
Compare
| // ---- Transactions ---- | ||
| def transaction: Option[Transaction] = None | ||
|
|
||
| def withTransaction(transaction: Transaction): CatalogManager = |
There was a problem hiding this comment.
Is it safe to do by default? Why not throw an exception to be on a safer side?
There was a problem hiding this comment.
Or not implement at all
There was a problem hiding this comment.
Yes better. This is now implemented in the DefaultCatalogManager.
| def setCurrentNamespace(namespace: Array[String]): Unit | ||
|
|
||
| // ---- Session path ---- | ||
| def sessionPathEntries: Option[Seq[CatalogManager.SessionPathEntry]] |
There was a problem hiding this comment.
What about importing SessionPathEntry directly?
There was a problem hiding this comment.
Cleaned it up.
| def currentNamespace: Array[String] | ||
| def setCurrentNamespace(namespace: Array[String]): Unit | ||
|
|
||
| // ---- Session path ---- |
There was a problem hiding this comment.
@cloud-fan, could you help review this, please?
aokolnychyi
left a comment
There was a problem hiding this comment.
It looks good to me, but I would want @cloud-fan and @gengliangwang to take a look too.
46320d4 to
b6035b9
Compare
cloud-fan
left a comment
There was a problem hiding this comment.
This is a clean refactor of the previous TODO ("extract a CatalogManager trait so TACM doesn't carry inherited mutable state it never uses"). The old class CatalogManager becomes a trait, with two concrete implementations: DefaultCatalogManager (owns the session state) and TransactionAwareCatalogManager (decorator). Construction sites move from new CatalogManager(...) to new DefaultCatalogManager(...); all other call sites continue to use CatalogManager as the abstract type. The intended benefit is realized: the compiler now enforces that any new method on the trait is implemented in both classes.
One thing worth surfacing — the PR description says "No user-facing change" but the refactor silently fixes a real bug. In the old code, every withTransaction call constructed a TACM whose super-constructor re-ran v1SessionCatalog.bindCatalogManagerForSessionFunctionKinds(this), rebinding the v1 SessionCatalog's session-function-kinds source to the transient TACM. Because old TACM never overrode sessionFunctionKindsForUnqualifiedResolution, subsequent unqualified function resolution that took the path-kinds route observed TACM's own (always-empty) _currentCatalogName / _currentNamespace / _sessionPath fields — i.e., a "no USE / no SET PATH" view — even after the transaction had ended. The new design binds only in DefaultCatalogManager's constructor, so the binding stays on the long-lived DefaultCM. Worth updating the PR description to call this out; the bug fix is more interesting than the refactor itself.
Also: as a side-effect of the trait extraction, two Scaladoc references in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala (lines 124 and 138) now point to [[CatalogManager]]'s "constructor"; the binding now lives in DefaultCatalogManager's constructor. Worth a one-line touch-up in this PR or a follow-up.
| private[sql] class DefaultCatalogManager( | ||
| val defaultSessionCatalog: CatalogPlugin, | ||
| val v1SessionCatalog: SessionCatalog) extends SQLConfHelper with Logging { | ||
| val v1SessionCatalog: SessionCatalog) extends CatalogManager { |
There was a problem hiding this comment.
The constructor-vals here (and val tempVariableManager at line 122) implement abstract def members of the trait without the override modifier. Scala allows omitting override for abstract overrides, but every other method in this class uses override def, and adding it on these three makes the implements-from-trait relationship explicit (and prevents accidental shadow-definition if the abstract is later removed from the trait).
| override def sessionFunctionKindsForUnqualifiedResolution(): Seq[SessionFunctionKind] = | ||
| delegate.sessionFunctionKindsForUnqualifiedResolution() | ||
|
|
||
| override private[sql] def reset(): Unit = delegate.reset() |
There was a problem hiding this comment.
Worth a one-line comment: this is a behavior change versus the pre-refactor code. The old reset() was inherited from CatalogManager and cleared TACM's own (separately initialized) super-state without touching the delegate. The new override clobbers the underlying DefaultCatalogManager — clears its catalogs map, session path, current catalog/namespace, etc. reset() is test-only and not currently called on a TACM, so this is harmless today, but a future test reaching for it would silently affect session-level state outside the transaction.
There was a problem hiding this comment.
I would omit the comment because the base behaviour was only introduced a few days ago. This PR is a direct follow up to fix issues of the base PR.
What changes were proposed in this pull request?
Currently, the
TransactionAwareCatalogManageris a decorator of theCatalogManager. This introduces some risk because newly introduced methods in theCatalogManagermight not be implemented inTransactionAwareCatalogManager. To resolve this issue we create a newCatalogManagertrait that allCatalogManagerimplementations can implement.Furthermore, the PR fixes an issue where previously every
withTransactioncall constructed a TACM whose super-constructor re-ranv1SessionCatalog.bindCatalogManagerForSessionFunctionKinds(this), rebinding the v1 SessionCatalog's session-function-kinds source to the transient TACM. Because old TACM never overrode sessionFunctionKindsForUnqualifiedResolution, subsequent unqualified function resolution that took the path-kinds route observed TACM's own (always-empty)_currentCatalogName / _currentNamespace / _sessionPath fields- i.e., a "no USE / no SET PATH" view — even after the transaction had ended. The new design binds only inDefaultCatalogManager's constructor, so the binding stays on the long-lived DefaultCM.Why are the changes needed?
Ensures all newly introduced methods in the CatalogManager are going to be implemented in both the
DefaultCatalogManagerand theTransactionAwareCatalogManager.Does this PR introduce any user-facing change?
Yes. Fixes an issue described in the description above.
How was this patch tested?
Refactor PR. Used existing tests.
Was this patch authored or co-authored using generative AI tooling?
Claude Sonnet 4.6.