Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
104 changes: 101 additions & 3 deletions core/src/main/java/org/apache/iceberg/rest/RESTCatalogProperties.java
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,9 @@
*/
package org.apache.iceberg.rest;

import java.util.Arrays;
import java.util.stream.Collectors;

public final class RESTCatalogProperties {

private RESTCatalogProperties() {}
Expand All @@ -37,12 +40,107 @@ private RESTCatalogProperties() {}

public static final String NAMESPACE_SEPARATOR = "namespace-separator";

// Enable planning on the REST server side
public static final String REST_SCAN_PLANNING_ENABLED = "rest-scan-planning-enabled";
public static final boolean REST_SCAN_PLANNING_ENABLED_DEFAULT = false;
// Configure scan planning mode
// Can be set by server in LoadTableResponse.config() or by client in catalog properties
// Negotiation rules: ONLY beats PREFERRED, both PREFERRED = client wins
// Default when neither client nor server provides: client-preferred
public static final String SCAN_PLANNING_MODE = "scan-planning-mode";
public static final String SCAN_PLANNING_MODE_DEFAULT =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make sense to split out introducing the different planning modes from the option of overriding this at the table level

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this more from backward compatibility pov ? asking because we haven't shipped any iceberg java version yet with this config

ScanPlanningMode.CLIENT_PREFERRED.modeName();

public enum SnapshotMode {
ALL,
REFS
}

/**
* Enum to represent scan planning mode configuration.
*
* <p>Can be configured by:
*
* <ul>
* <li>Server: Returned in LoadTableResponse.config() to advertise server preference/requirement
* <li>Client: Set in catalog properties to set client preference/requirement
* </ul>
*
* <p>When both client and server configure this property, the values are negotiated:
*
* <p>Values:
*
* <ul>
* <li>CLIENT_ONLY - MUST use client-side planning. Fails if paired with CATALOG_ONLY from other
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using ENFORCED might be a better fit instead of ONLY, wdyt? That explains the intent more naturally

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean CLIENT_ENFORCED | CATALOG_ENFORCED, it believe it does more authoritative, since we are including in spec this might be language we prefer, let me think a bit more on this.

Copy link
Contributor

@danielcweeks danielcweeks Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would actually prefer that we just remove it and have client, client-preferred, catalog-preferred, and catalog.

Using words like ENFORCED or REQUIRED don't quite feel right and ultimately, if we're going with this enumeration, it is explicit.

Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still a bit skeptical that we even need the notion of preferences, e.g. client-preferred, catalog-preferred. it''s plausible that servers could have more insights to give a more intelligent preference but it feels over complicated compared to just having a "clients-choice" (not a real mode, just something that's inferred when the endpoint is supported but not required) instead of 2 preferences. It simplifies the decision matrix logic below, and clients can then use their own heuristics.

I think that's what the decision as to if preferences or not makes sense, comes down to:
is it better to have clients just make intelligent choices when server side planning is available but not required, or is it better for servers to indicate preferences. My thought process is if a server really feels like it's advantageous to do remote planning, may as well just send it back as required.

I should note: I'm not super opinionated on this, but I do think it'd be great if we could outline some concrete cases where we think a preference is advantageous (in both directions) just to make it clear if the complexity is worth it.

* side.
* <li>CLIENT_PREFERRED (default) - Prefer client-side planning but flexible.
* <li>CATALOG_PREFERRED - Prefer server-side planning but flexible. Falls back to client if
* server doesn't support planning endpoints.
* <li>CATALOG_ONLY - MUST use server-side planning. Requires server support. Fails if paired
* with CLIENT_ONLY from other side.
* </ul>
*
* <p>Negotiation rules when both sides are configured:
*
* <ul>
* <li><b>Incompatible</b>: CLIENT_ONLY + CATALOG_ONLY = FAIL
* <li><b>ONLY beats PREFERRED</b>: One "ONLY" + opposite "PREFERRED" = ONLY wins (inflexible
* beats flexible)
* <li><b>Both PREFERRED</b>: Different PREFERRED types = Client config wins
* <li><b>Both same</b>: Use that planning type
* <li><b>Only one configured</b>: Use the configured side
* </ul>
*/
public enum ScanPlanningMode {
CLIENT_ONLY("client-only"),
CLIENT_PREFERRED("client-preferred"),
CATALOG_PREFERRED("catalog-preferred"),
CATALOG_ONLY("catalog-only");

private final String modeName;

ScanPlanningMode(String modeName) {
this.modeName = modeName;
}

public String modeName() {
return modeName;
}

public boolean isClientOnly() {
return this == CLIENT_ONLY;
}

public boolean isCatalogOnly() {
return this == CATALOG_ONLY;
}

public boolean isOnly() {
return this == CLIENT_ONLY || this == CATALOG_ONLY;
}

public boolean isPreferred() {
return this == CLIENT_PREFERRED || this == CATALOG_PREFERRED;
}

public boolean prefersClient() {
return this == CLIENT_ONLY || this == CLIENT_PREFERRED;
}

public boolean prefersCatalog() {
return this == CATALOG_ONLY || this == CATALOG_PREFERRED;
}

public static ScanPlanningMode fromString(String mode) {
if (mode == null) {
return CLIENT_PREFERRED;
}
for (ScanPlanningMode planningMode : values()) {
if (planningMode.modeName.equalsIgnoreCase(mode)) {
return planningMode;
}
}
String validModes =
Arrays.stream(values()).map(ScanPlanningMode::modeName).collect(Collectors.joining(", "));
throw new IllegalArgumentException(
String.format("Invalid scan planning mode: %s. Valid values are: %s", mode, validModes));
}
}
}
66 changes: 54 additions & 12 deletions core/src/main/java/org/apache/iceberg/rest/RESTSessionCatalog.java
Original file line number Diff line number Diff line change
Expand Up @@ -159,7 +159,8 @@ public class RESTSessionCatalog extends BaseViewSessionCatalog
private MetricsReporter reporter = null;
private boolean reportingViaRestEnabled;
private Integer pageSize = null;
private boolean restScanPlanningEnabled;
private RESTCatalogProperties.ScanPlanningMode clientConfiguredScanPlanningMode;
private RESTCatalogProperties.ScanPlanningMode catalogLevelScanPlanningMode;
private CloseableGroup closeables = null;
private Set<Endpoint> endpoints;
private Supplier<Map<String, String>> mutationHeaders = Map::of;
Expand Down Expand Up @@ -272,11 +273,24 @@ public void initialize(String name, Map<String, String> unresolved) {
RESTCatalogProperties.NAMESPACE_SEPARATOR,
RESTUtil.NAMESPACE_SEPARATOR_URLENCODED_UTF_8);

this.restScanPlanningEnabled =
PropertyUtil.propertyAsBoolean(
mergedProps,
RESTCatalogProperties.REST_SCAN_PLANNING_ENABLED,
RESTCatalogProperties.REST_SCAN_PLANNING_ENABLED_DEFAULT);
// Client-configured scan planning mode (from catalog properties, not from server config)
// Read from un-merged properties to avoid picking up server-provided defaults
String clientScanPlanningConfig =
PropertyUtil.propertyAsString(props, RESTCatalogProperties.SCAN_PLANNING_MODE, null);
this.clientConfiguredScanPlanningMode =
clientScanPlanningConfig != null
? RESTCatalogProperties.ScanPlanningMode.fromString(clientScanPlanningConfig)
: null;

// Also store the catalog-level (possibly server-provided) scan planning mode as a fallback
// This comes from ConfigResponse.overrides() and gets merged into mergedProps
String catalogLevelConfig =
PropertyUtil.propertyAsString(mergedProps, RESTCatalogProperties.SCAN_PLANNING_MODE, null);
this.catalogLevelScanPlanningMode =
catalogLevelConfig != null
? RESTCatalogProperties.ScanPlanningMode.fromString(catalogLevelConfig)
: null;

super.initialize(name, mergedProps);
}

Expand Down Expand Up @@ -486,7 +500,7 @@ public Table loadTable(SessionContext context, TableIdentifier identifier) {
// RestTable should only be returned for non-metadata tables, because client would
// not have access to metadata files for example manifests, since all it needs is catalog.
if (metadataType == null) {
RESTTable restTable = restTableForScanPlanning(ops, finalIdentifier, tableClient);
RESTTable restTable = restTableForScanPlanning(ops, finalIdentifier, tableClient, tableConf);
if (restTable != null) {
return restTable;
}
Expand All @@ -505,9 +519,35 @@ public Table loadTable(SessionContext context, TableIdentifier identifier) {
}

private RESTTable restTableForScanPlanning(
TableOperations ops, TableIdentifier finalIdentifier, RESTClient restClient) {
// server supports remote planning endpoint and server / client wants to do server side planning
if (endpoints.contains(Endpoint.V1_SUBMIT_TABLE_SCAN_PLAN) && restScanPlanningEnabled) {
TableOperations ops,
TableIdentifier finalIdentifier,
RESTClient restClient,
Map<String, String> tableConf) {
// Get client-configured mode (set in catalog properties during initialization)
RESTCatalogProperties.ScanPlanningMode clientMode = clientConfiguredScanPlanningMode;

// Get server-provided mode
// Priority: table-level config > catalog-level config (from ConfigResponse)
String tableLevelModeConfig = tableConf.get(RESTCatalogProperties.SCAN_PLANNING_MODE);
RESTCatalogProperties.ScanPlanningMode serverMode;
if (tableLevelModeConfig != null) {
serverMode = RESTCatalogProperties.ScanPlanningMode.fromString(tableLevelModeConfig);
} else {
// Fall back to catalog-level server config (from ConfigResponse.overrides())
serverMode = catalogLevelScanPlanningMode;
}

// Check server capabilities
boolean serverSupportsPlanning = endpoints.contains(Endpoint.V1_SUBMIT_TABLE_SCAN_PLAN);

// Negotiate scan planning strategy
// Rules: ONLY beats PREFERRED, both PREFERRED = client wins, one side only = use it
ScanPlanningNegotiator.PlanningDecision decision =
ScanPlanningNegotiator.negotiate(
clientMode, serverMode, serverSupportsPlanning, finalIdentifier);

// Apply the decision
if (decision == ScanPlanningNegotiator.PlanningDecision.USE_CATALOG_PLANNING) {
return new RESTTable(
ops,
fullTableName(finalIdentifier),
Expand All @@ -518,6 +558,8 @@ private RESTTable restTableForScanPlanning(
paths,
endpoints);
}

// USE_CLIENT_PLANNING
return null;
}

Expand Down Expand Up @@ -589,7 +631,7 @@ public Table registerTable(

trackFileIO(ops);

RESTTable restTable = restTableForScanPlanning(ops, ident, tableClient);
RESTTable restTable = restTableForScanPlanning(ops, ident, tableClient, tableConf);
if (restTable != null) {
return restTable;
}
Expand Down Expand Up @@ -858,7 +900,7 @@ public Table create() {

trackFileIO(ops);

RESTTable restTable = restTableForScanPlanning(ops, ident, tableClient);
RESTTable restTable = restTableForScanPlanning(ops, ident, tableClient, tableConf);
if (restTable != null) {
return restTable;
}
Expand Down
Loading