[#11262] fix: Clean up orphaned schema entities after table or view drop #11362

Open
roryqi wants to merge 4 commits into apache:main from qqqttt123:worktree-fix-11262-drop-schema-hierarchical

roryqi (Contributor) commented Jun 2, 2026

What changes were proposed in this pull request?

This PR cleans up Gravitino schema entities that become orphaned when the underlying catalog automatically removes empty parent namespaces after dropping a table, view, or namespace.

The patch:

  • adds shared orphan schema entity cleanup logic
  • applies the cleanup from Gravitino schema, table, and view drop paths
  • applies the cleanup from Iceberg REST namespace, table, and view drop hooks
  • preserves Iceberg REST table/view backend-state reconciliation added on main
  • adds regression tests for core dispatchers and Iceberg REST hook dispatchers
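
The cleanup described in the list above can be pictured roughly as follows. This is a minimal sketch, not the `SchemaEntityCleaner` from this PR: the class name, the method signature, the `Set<String>` standing in for the entity store, and the `Predicate` standing in for the catalog existence probe are all hypothetical. It also assumes that a namespace which still exists in the catalog implies its parents exist, so the walk stops at the first surviving ancestor.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch of orphan cleanup; names and types are assumptions. */
final class OrphanSchemaCleanupSketch {

  /**
   * Walks from the given schema up through its parents and removes every stored
   * entity whose namespace no longer exists in the underlying catalog.
   */
  static List<String> deleteOrphans(
      String schemaName, String separator, Predicate<String> existsInCatalog, Set<String> store) {
    List<String> removed = new ArrayList<>();
    String current = schemaName;
    while (current != null) {
      if (existsInCatalog.test(current)) {
        break; // Assumption: an existing namespace implies existing parents.
      }
      if (store.remove(current)) {
        removed.add(current);
      }
      // Move to the parent, e.g. "a.b.c" -> "a.b"; stop after the top level.
      int idx = current.lastIndexOf(separator);
      current = idx < 0 ? null : current.substring(0, idx);
    }
    return removed;
  }

  public static void main(String[] args) {
    // The store still holds entities for a, a.b and a.b.c, but the catalog
    // implicitly removed a.b and a.b.c after the last table was dropped.
    Set<String> store = new HashSet<>(List.of("a", "a.b", "a.b.c", "x"));
    List<String> removed = deleteOrphans("a.b.c", ".", name -> name.equals("a"), store);
    System.out.println(removed); // [a.b.c, a.b]
  }
}
```

In this sketch the same routine would be called after a schema, table, or view drop, starting from the schema that contained the dropped object.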

Why are the changes needed?

Some catalogs, including Iceberg JdbcCatalog, may implicitly remove empty namespaces and their empty parent namespaces after the last table/view/sub-namespace is dropped. Gravitino could still keep the corresponding schema entities in its store, leaving stale metadata.

Fix: #11262

Does this PR introduce any user-facing change?

No new API, configuration, or user-facing behavior is introduced. This fixes stale internal metadata cleanup after drop operations.

How was this patch tested?

  • Core dispatcher unit tests
  • Iceberg REST hook dispatcher unit tests
  • git diff --check

@roryqi roryqi force-pushed the worktree-fix-11262-drop-schema-hierarchical branch from 60e98e7 to f278cd9 Compare June 2, 2026 11:44
github-actions Bot commented Jun 2, 2026

Code Coverage Report

Overall Project 66.82% +0.13% 🟢
Files changed 75.76% 🟢

Module Coverage
aliyun 1.72% 🔴
api 46.82% 🟢
authorization-common 85.96% 🟢
aws 3.66% 🔴
azure 2.47% 🔴
catalog-common 10.04% 🔴
catalog-fileset 80.33% 🟢
catalog-glue 66.08% 🟢
catalog-hive 79.55% 🟢
catalog-jdbc-clickhouse 80.02% 🟢
catalog-jdbc-common 45.31% 🟢
catalog-jdbc-doris 80.28% 🟢
catalog-jdbc-hologres 54.03% 🟢
catalog-jdbc-mysql 79.23% 🟢
catalog-jdbc-oceanbase 78.38% 🟢
catalog-jdbc-postgresql 82.29% 🟢
catalog-jdbc-starrocks 78.51% 🟢
catalog-kafka 77.01% 🟢
catalog-lakehouse-generic 44.89% 🟢
catalog-lakehouse-hudi 79.1% 🟢
catalog-lakehouse-iceberg 85.66% 🟢
catalog-lakehouse-paimon 79.29% 🟢
catalog-model 77.72% 🟢
cli 44.51% 🟢
client-java 77.91% 🟢
common 49.99% 🟢
core 82.48% -0.03% 🟢
filesystem-hadoop3 76.97% 🟢
flink 0.0% 🔴
flink-common 41.2% 🟢
flink-runtime 0.0% 🔴
gcp 14.12% 🔴
hadoop-common 10.39% 🔴
hive-metastore-common 53.26% 🟢
iceberg-common 56.54% 🟢
iceberg-rest-server 72.83% -1.17% 🟢
idp-basic 85.99% 🟢
integration-test-common 0.0% 🔴
jobs 66.17% 🟢
lance-common 20.83% 🔴
lance-rest-server 60.27% 🟢
lineage 53.02% 🟢
optimizer 82.95% 🟢
optimizer-api 21.95% 🔴
server 85.73% 🟢
server-common 73.13% 🟢
spark 32.79% 🔴
spark-common 39.75% 🔴
trino-connector 39.44% 🔴
Files
Module File Coverage
core ViewOperationDispatcher.java 83.27% 🟢
core SchemaOperationDispatcher.java 82.29% 🟢
core TableOperationDispatcher.java 80.49% 🟢
core SchemaEntityCleaner.java 79.17% 🟢
iceberg-rest-server IcebergNamespaceHookDispatcher.java 92.65% 🟢
iceberg-rest-server IcebergViewHookDispatcher.java 88.0% 🟢
iceberg-rest-server IcebergTableHookDispatcher.java 81.25% 🟢
iceberg-rest-server IcebergOrphanSchemaCleanup.java 78.57% 🟢
iceberg-rest-server RESTService.java 0.0% 🔴

@roryqi roryqi marked this pull request as draft June 2, 2026 12:56
roryqi and others added 2 commits June 3, 2026 02:37
…splitSchemaName

Replace raw split(Pattern.quote(separator)) calls in IcebergNamespaceHookDispatcher
with the shared HierarchicalSchemaUtil.splitSchemaName util and drop the now-unused
java.util.regex.Pattern import.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
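
For context on the commit above: `String.split` treats its argument as a regular expression, which is why the raw calls needed `Pattern.quote`. A minimal sketch of what a shared split helper might look like follows. The method below is a hypothetical stand-in; the actual signature of `HierarchicalSchemaUtil.splitSchemaName` is not shown in this thread.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a shared split helper; the signature is an assumption. */
final class SplitSchemaNameSketch {

  static String[] splitSchemaName(String schemaName, String separator) {
    // String.split takes a regex, so the separator is quoted to match literally.
    return schemaName.split(Pattern.quote(separator));
  }

  public static void main(String[] args) {
    // Quoted separator: the name is split into its levels.
    System.out.println(Arrays.toString(splitSchemaName("a.b.c", "."))); // [a, b, c]
    // Unquoted ".": every character matches, so no parts remain.
    System.out.println("a.b.c".split(".").length); // 0
  }
}
```

Keeping the quoting in one helper means call sites cannot forget it, and the `java.util.regex.Pattern` import is no longer needed where the helper is used.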
…best-effort and de-duplicate

- Make SchemaEntityCleaner.deleteOrphanedSchemaEntities best-effort: it now logs
  and swallows store/probe failures instead of throwing, so a transient error
  during the secondary orphan cleanup no longer fails an already-successful
  table/view/schema drop.
- Extract the duplicated best-effort cleanup in IcebergTableHookDispatcher and
  IcebergViewHookDispatcher into a shared IcebergOrphanSchemaCleanup helper.
- Use HierarchicalSchemaUtil.splitSchemaName instead of raw
  split(Pattern.quote(separator)) in the Iceberg table/view drop cleanup.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
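
The best-effort behavior described in the commit above can be sketched as a small wrapper that logs and swallows failures from the secondary cleanup. This is an illustration only: the class, the `runBestEffort` and `dropTable` methods, and the use of `java.util.logging` are assumptions, not the code in `SchemaEntityCleaner` or `IcebergOrphanSchemaCleanup`.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch of a log-and-swallow wrapper for secondary cleanup. */
final class BestEffortCleanupSketch {
  private static final Logger LOG =
      Logger.getLogger(BestEffortCleanupSketch.class.getName());

  /** Runs the cleanup and returns false instead of throwing when it fails. */
  static boolean runBestEffort(String description, Runnable cleanup) {
    try {
      cleanup.run();
      return true;
    } catch (RuntimeException e) {
      // The primary drop already succeeded, so the failure is only logged.
      LOG.log(Level.WARNING, "Best-effort cleanup failed, ignoring: " + description, e);
      return false;
    }
  }

  /** Hypothetical drop path: the drop result does not depend on the cleanup. */
  static boolean dropTable(boolean primaryDropResult, Runnable orphanCleanup) {
    if (primaryDropResult) {
      runBestEffort("orphaned schema entities", orphanCleanup);
    }
    return primaryDropResult;
  }

  public static void main(String[] args) {
    boolean dropped =
        dropTable(true, () -> { throw new IllegalStateException("store unavailable"); });
    System.out.println(dropped); // true
  }
}
```

The trade-off is that a failed cleanup leaves the stale entity in place until a later drop or cleanup pass removes it, in exchange for not failing an operation that has already taken effect in the catalog.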
@roryqi roryqi requested a review from mchades June 3, 2026 06:15
@roryqi roryqi self-assigned this Jun 3, 2026
@roryqi roryqi marked this pull request as ready for review June 3, 2026 06:17
# Conflicts:
#	iceberg/iceberg-rest-server/src/main/java/org/apache/gravitino/iceberg/RESTService.java
roryqi (Contributor, Author) commented Jun 3, 2026

@mchades Could you help take a look?

Comment on lines +440 to +444
    schemaIdent ->
        doWithCatalog(
            catalogIdent,
            c -> c.doWithSchemaOps(s -> s.schemaExists(schemaIdent)),
            RuntimeException.class));
Contributor

Can we retrieve the results in a single call to listSchemas to avoid multiple invocations of the underlying catalog?

Contributor Author

Iceberg doesn't currently support a similar operation, but this improvement is under discussion. Maybe we can optimize this in the future.
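
To make the trade-off in this exchange concrete, here is a sketch of the batched variant the reviewer asks about: compare all candidate schemas against one listing result instead of probing each with `schemaExists`. It assumes the catalog could return every hierarchical schema name in a single listing, which, per the reply above, Iceberg does not currently offer. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: one listing result replaces per-schema existence probes. */
final class BatchedExistenceCheckSketch {

  /** Expands "a.b.c" into [a.b.c, a.b, a], leaf first. */
  static List<String> selfAndAncestors(String schemaName, String separator) {
    List<String> names = new ArrayList<>();
    String current = schemaName;
    while (current != null) {
      names.add(current);
      int idx = current.lastIndexOf(separator);
      current = idx < 0 ? null : current.substring(0, idx);
    }
    return names;
  }

  /** Returns the candidates that are missing from the single listing result. */
  static List<String> findOrphans(List<String> candidates, Set<String> listedSchemas) {
    List<String> orphans = new ArrayList<>();
    for (String candidate : candidates) {
      if (!listedSchemas.contains(candidate)) {
        orphans.add(candidate);
      }
    }
    return orphans;
  }

  public static void main(String[] args) {
    // Assumption: this set came from one listing call against the catalog.
    Set<String> listed = Set.of("a", "other");
    System.out.println(findOrphans(selfAndAncestors("a.b.c", "."), listed)); // [a.b.c, a.b]
  }
}
```

With per-schema probes the number of catalog calls grows with the depth of the hierarchy; with a listing it stays at one call, at the cost of transferring the full schema list.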

@roryqi roryqi requested a review from mchades June 3, 2026 13:05

Development

Successfully merging this pull request may close these issues.

[Bug report] dropSchema returns false for hierarchical schema after leaf

2 participants