@stuxuhai
Contributor

@stuxuhai stuxuhai commented Dec 22, 2025

Purpose

This PR fixes a bug in which CREATE VIEW IF NOT EXISTS fails when a non-Iceberg view (e.g., a Hive view) already exists in the SparkSessionCatalog. Instead of succeeding silently, the statement throws NoSuchIcebergViewException: Not an iceberg view (wrapped in a QueryExecutionException).

The Problem

When SparkSessionCatalog is configured with the settings below, the failure unfolds as follows:

  spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
  spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog
  spark.sql.catalog.spark_catalog.type=hive

  1. A user executes CREATE VIEW IF NOT EXISTS db.view_name AS ....
  2. db.view_name already exists as a Hive view (or any other non-Iceberg table or view).
  3. SparkSessionCatalog.createView currently delegates directly to the underlying Iceberg catalog (asViewCatalog.createView).
  4. The Iceberg catalog (e.g., HiveCatalog) attempts to load the view. Since it is not an Iceberg view, it throws NoSuchIcebergViewException.
  5. Spark expects ViewAlreadyExistsException to handle the IF NOT EXISTS logic. Because it receives a different exception, the query fails entirely.
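
To illustrate step 5, here is a minimal, self-contained sketch of why the exception type matters. It does not use the real Spark or Iceberg classes: the two exception types are simplified stand-ins with the same names, and ViewCreator and IfNotExistsSketch are hypothetical names introduced only for this example. The assumption is simply that the caller skips the statement for one specific exception and lets everything else propagate.

```java
// Simplified stand-ins; these are NOT the real Spark or Iceberg classes.
class ViewAlreadyExistsException extends Exception {
  ViewAlreadyExistsException(String name) {
    super("View already exists: " + name);
  }
}

class NoSuchIcebergViewException extends RuntimeException {
  NoSuchIcebergViewException(String name) {
    super("Not an iceberg view: " + name);
  }
}

// Hypothetical interface standing in for the catalog's createView entry point.
interface ViewCreator {
  void createView(String name) throws ViewAlreadyExistsException;
}

class IfNotExistsSketch {
  /** Returns true if the view was created, false if the statement was skipped. */
  static boolean createViewIfNotExists(ViewCreator catalog, String name) {
    try {
      catalog.createView(name);
      return true;
    } catch (ViewAlreadyExistsException e) {
      // IF NOT EXISTS: an existing view is not an error, so the statement is a no-op.
      return false;
    }
    // Any other exception type is not caught here and fails the whole query.
  }

  public static void main(String[] args) {
    ViewCreator reportsExisting = name -> { throw new ViewAlreadyExistsException(name); };
    ViewCreator reportsWrongType = name -> { throw new NoSuchIcebergViewException(name); };

    // Expected exception type: the statement is skipped.
    System.out.println(createViewIfNotExists(reportsExisting, "db.view_name"));

    // Unexpected exception type: the call fails instead of being skipped.
    try {
      createViewIfNotExists(reportsWrongType, "db.view_name");
    } catch (NoSuchIcebergViewException e) {
      System.out.println("query failed: " + e.getMessage());
    }
  }
}
```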

The Fix

Before delegating the creation to the Iceberg catalog, we explicitly check if the identifier already exists in the underlying session catalog (which is the source of truth for the global namespace).

If getSessionCatalog().tableExists(ident) returns true, we immediately throw ViewAlreadyExistsException. This allows Spark's analysis rules to correctly catch the exception and ignore the operation as per IF NOT EXISTS semantics.
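
The shape of the guard can be sketched as follows. This is an illustrative model under stated assumptions, not the code in this PR: CreateViewGuardSketch, registerHiveView, and delegateToIcebergCatalog are hypothetical names, the session catalog is modeled as an in-memory set, and ViewAlreadyExistsException is a simplified stand-in for the Spark class of the same name.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-in; NOT the real Spark exception class.
class ViewAlreadyExistsException extends Exception {
  ViewAlreadyExistsException(String ident) {
    super("View already exists: " + ident);
  }
}

// Hypothetical model of the catalog; the real SparkSessionCatalog is far richer.
class CreateViewGuardSketch {
  // Models the session catalog's global namespace (tables and views of any format).
  private final Set<String> sessionCatalogEntries = new HashSet<>();
  // Models the subset of entries that are Iceberg views.
  private final Set<String> icebergViews = new HashSet<>();

  void registerHiveView(String ident) {
    sessionCatalogEntries.add(ident);
  }

  boolean tableExists(String ident) {
    return sessionCatalogEntries.contains(ident);
  }

  boolean isIcebergView(String ident) {
    return icebergViews.contains(ident);
  }

  void createView(String ident) throws ViewAlreadyExistsException {
    // The guard: check the session catalog first and report the exception type
    // that the caller knows how to handle for IF NOT EXISTS.
    if (tableExists(ident)) {
      throw new ViewAlreadyExistsException(ident);
    }
    delegateToIcebergCatalog(ident);
  }

  // Models the delegate, which cannot handle a pre-existing non-Iceberg entry.
  private void delegateToIcebergCatalog(String ident) {
    if (sessionCatalogEntries.contains(ident)) {
      throw new IllegalStateException("Not an iceberg view: " + ident);
    }
    sessionCatalogEntries.add(ident);
    icebergViews.add(ident);
  }
}
```

With the guard in place, the delegate in this model is never reached for an identifier that already exists, so the only failure the caller sees for a name collision is ViewAlreadyExistsException.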

Verification

  • Added a new unit test in TestSparkSessionCatalog to verify that CREATE VIEW IF NOT EXISTS succeeds when a Hive view exists.
  • Verified that CREATE VIEW (without IF NOT EXISTS) correctly throws AnalysisException (Table or view already exists).

@github-actions github-actions bot added the spark label Dec 22, 2025
@nastra nastra requested review from huaxingao and nastra December 23, 2025 17:50
}

try {
sql("CREATE VIEW IF NOT EXISTS %s AS SELECT 2 AS id", viewName);
Contributor

Does this exercise SparkSessionCatalog#createView(ViewInfo)? In my environment, CREATE VIEW is planned as V1 CreateViewCommand (not the V2 ViewCatalog path), so the tests may pass without hitting the code change.

Contributor Author

Thanks for the review. The change is intended to be exercised with the Iceberg Spark extensions enabled: spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions. I missed documenting this prerequisite in the PR description; apologies for the omission.

Contributor

Could we also verify the test exercises the V2 view path so it actually hits SparkSessionCatalog#createView(ViewInfo)?

@stuxuhai
Contributor Author

Apologies for the confusion.
While syncing with the latest apache:main, I force-reset the PR branch, which resulted in this PR being automatically closed. That was not intended.

I’ve recreated the changes in a new PR and will follow up there. Thanks for the review and sorry for the extra noise.
