Add TableConfigValidator and InstanceConfigValidator SPI for batch restart enforcement#18167
Conversation
…estart enforcement Introduces validation SPI interfaces in pinot-spi for intercepting table and instance config mutations before persistence. This enables StarTree to enforce batch restart invariants (pool tags, replica groups) at mutation time rather than only at restart time. Changes: - TableConfigValidator/InstanceConfigValidator interfaces in pinot-spi - List-based registries (CopyOnWriteArrayList, first-rejection short-circuits) - ConfigValidationException for rejection signaling (maps to HTTP 400) - InstanceUtils.toInstance() reverse converter (InstanceConfig -> Instance) - Validator call sites in PinotTableRestletResource, PinotHelixResourceManager, PinotInstanceRestletResource, TableConfigsRestletResource - Unit tests for registries, toInstance() roundtrip, and resource manager validation wiring with mocked Helix dependencies
…onUtils Consolidate SPI validation into TableConfigValidationUtils.validateTableConfig() so every caller gets it automatically. This also adds the missing validation to the copyTable endpoint, which previously had a TODO placeholder.
There was a problem hiding this comment.
Pull request overview
Adds SPI hooks to validate table and instance configuration mutations before they’re persisted, enabling enforcement of batch restart / mutation rules and returning validation failures as client errors.
Changes:
- Introduces
TableConfigValidator/InstanceConfigValidatorSPIs and in-memory registries inpinot-spi. - Adds
ConfigValidationExceptionfor validator rejections and wires validator invocations into controller mutation paths. - Adds
InstanceUtils.toInstance(InstanceConfig)to reconstructInstanceobjects for validation (notably forupdateInstanceTags) and adds unit tests for the new behavior.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableConfigValidator.java | New SPI interface for table config mutation validation. |
| pinot-spi/src/main/java/org/apache/pinot/spi/config/table/TableConfigValidatorRegistry.java | Registry for table validators; invoked before persistence. |
| pinot-spi/src/main/java/org/apache/pinot/spi/config/instance/InstanceConfigValidator.java | New SPI interface for instance mutation validation. |
| pinot-spi/src/main/java/org/apache/pinot/spi/config/instance/InstanceConfigValidatorRegistry.java | Registry for instance validators; invoked before persistence. |
| pinot-spi/src/main/java/org/apache/pinot/spi/exception/ConfigValidationException.java | Exception type used by validators to reject mutations. |
| pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java | Calls instance validators for add/update/updateTags prior to Helix persistence. |
| pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/TableConfigValidationUtils.java | Calls table validators as part of existing table config validation flow. |
| pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/TableConfigsRestletResource.java | Calls table validators when validating composite TableConfigs. |
| pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotTableRestletResource.java | Adds table config validation on the copy-table flow. |
| pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotInstanceRestletResource.java | Maps ConfigValidationException to HTTP 400 for instance mutation endpoints. |
| pinot-common/src/main/java/org/apache/pinot/common/utils/config/InstanceUtils.java | Adds toInstance(InstanceConfig) reverse conversion for validation usage. |
| pinot-common/src/test/java/org/apache/pinot/common/utils/config/InstanceUtilsTest.java | Tests round-trip fidelity and unknown-prefix handling for toInstance. |
| pinot-spi/src/test/java/org/apache/pinot/spi/config/table/TableConfigValidatorRegistryTest.java | Tests validator registry semantics (ordering, short-circuit, reset). |
| pinot-spi/src/test/java/org/apache/pinot/spi/config/instance/InstanceConfigValidatorRegistryTest.java | Tests validator registry semantics (ordering, short-circuit, reset). |
| pinot-controller/src/test/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManagerConfigValidationTest.java | Verifies validators reject before persistence and allow persistence when passing. |
| TableConfigValidationUtils.validateTableConfig( | ||
| realtimeTableConfig, schema, null, _pinotHelixResourceManager, _controllerConf, _pinotTaskManager); |
There was a problem hiding this comment.
copyTable() now calls TableConfigValidationUtils.validateTableConfig(...), which can throw ConfigValidationException for a user-supplied config. The method currently catches the exception under the generic catch (Exception) and returns HTTP 500, which contradicts the intended “mapped to 400 at the REST boundary” behavior and will look like a server error to clients. Handle ConfigValidationException (and/or wrap it) as a BAD_REQUEST response here, consistent with other table mutation endpoints.
| // Read host/port from ZNRecord simple fields (source of truth, updated by updateHelixInstanceConfig) | ||
| String host = instanceConfig.getHostName(); | ||
| int port = Integer.parseInt(instanceConfig.getPort()); | ||
|
|
There was a problem hiding this comment.
toInstance() reads host directly from instanceConfig.getHostName(). InstanceUtils.getServerAdminEndpoint() already documents/handles a legacy case where the hostname can be stored as Server_<hostname>. If such legacy configs exist, toInstance() will reconstruct an Instance with an incorrect host value, which can cause validators (and any downstream logic) to behave incorrectly. Consider applying the same backward-compatible stripping when type == SERVER (and potentially other types if applicable), or otherwise normalizing the host field before constructing the Instance.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #18167 +/- ##
=========================================
Coverage 63.13% 63.13%
Complexity 1616 1616
=========================================
Files 3213 3216 +3
Lines 195730 195795 +65
Branches 30240 30246 +6
=========================================
+ Hits 123569 123621 +52
- Misses 62288 62298 +10
- Partials 9873 9876 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
xiangfu0
left a comment
There was a problem hiding this comment.
High-signal issues inline.
| } | ||
| } | ||
| TaskConfigUtils.validateTaskConfigs(tableConfigs.getOffline(), schema, _pinotTaskManager, typesToSkip); | ||
| TableConfigValidatorRegistry.validate(offlineTableConfig, schema); |
There was a problem hiding this comment.
This validator still runs inside validateConfig(), but addConfig()/updateConfig() call tuneConfig() only afterwards. That means the SPI is evaluating a different TableConfig than the one that is actually persisted, unlike the /tables endpoints where tuning happens before validation. A validator can therefore reject or allow the same mutation depending on which API the caller uses. If this SPI is meant to protect the stored config, it needs to run after tuning or in PinotHelixResourceManager.
| } | ||
|
|
||
| // Read host/port from ZNRecord simple fields (source of truth, updated by updateHelixInstanceConfig) | ||
| String host = instanceConfig.getHostName(); |
There was a problem hiding this comment.
Pinot still carries legacy server InstanceConfigs whose hostName is stored as Server_ (the codebase already has compatibility handling for that form). toInstance() returns that prefixed hostname verbatim, so validators reached from updateInstanceTags() will observe host=Server_localhost and any recomputed Helix id becomes Server_Server_localhost_. On upgraded clusters this breaks tag-update validation for older server instances unless the hostname is normalized here.
Summary
TableConfigValidatorandInstanceConfigValidatorSPI interfaces inpinot-spiwith list-based registries, enabling pre-mutation validation of table and instance configsInstanceUtils.toInstance(InstanceConfig)reverse converter for theupdateInstanceTagsvalidation pathPinotTableRestletResource,PinotHelixResourceManager,PinotInstanceRestletResource, andTableConfigsRestletResourceConfigValidationException(extendsRuntimeException) mapped to HTTP 400 at the REST boundaryTest plan
InstanceUtilsTest.testToInstance()— roundtrip fidelity for all 4 instance types + unknown type errorInstanceConfigValidatorRegistryTest— 5 tests: no-op when empty, rejection propagates, short-circuit, ordering, resetTableConfigValidatorRegistryTest— 5 tests: same coverage as abovePinotHelixResourceManagerConfigValidationTest— 4 mocked tests: addInstance/updateInstance/updateInstanceTags rejection before persistence, passing validator allows persistence