Skip to content

[bugfix] scope cluster-aware tableConfigs validations to validate/tune endpoints only#18580

Open
shounakmk219 wants to merge 1 commit into
apache:masterfrom
shounakmk219:skip-task-check-in-update
Open

[bugfix] scope cluster-aware tableConfigs validations to validate/tune endpoints only#18580
shounakmk219 wants to merge 1 commit into
apache:masterfrom
shounakmk219:skip-task-check-in-update

Conversation

@shounakmk219
Copy link
Copy Markdown
Collaborator

Summary

Fixes regressions introduced by #16675 in TableConfigsRestletResource.

That PR added three new cluster-aware validation types — TENANT, MINION_INSTANCES, ACTIVE_TASKS — to enhance the /tableConfigs/validate endpoint (its stated intent). However, it wired them into the shared private validateConfig(TableConfigs, String, String) helper, which is also called from the create (POST /tableConfigs) and update (PUT /tableConfigs/{tableName}) endpoints. This leaked the new checks onto paths where they don't belong.

Regressions observed

# Where Description
R1 (critical) updateConfig (PUT /tableConfigs/{tableName}) tableTasksValidation now runs on every update; it throws if any task exists for the table, so any config update is blocked while tasks are in the queue. The active-task check is only intended for create (catches stale state) and delete (blocks deleting with running tasks). The sibling PinotTableRestletResource.updateTableConfig at PUT /tables/{tableName} correctly does not gate updates this way.
R2 addConfig (POST /tableConfigs) tableTasksValidation is called twice for offline / realtime configs — once via the new block in the shared helper and again at the inline call sites in addConfig.
R3 addConfig The pre-existing @QueryParam("ignoreActiveTasks") flag is bypassed by the new call site; the helper gates only on the ACTIVE_TASKS skip type, so ignoreActiveTasks=true no longer suppresses the active-task check by itself — clients that relied on it broke.
R4 addConfig validateTableTenantConfig + validateTableTaskMinionInstanceTagConfig now run twice — once via the new helper, and again inside PinotHelixResourceManager.addTable(...).
R5 updateConfig Same double-call as R4 — once via the new helper, and again inside PinotHelixResourceManager.updateTableConfig(...).

R1–R3 are behavioral regressions affecting users; R4/R5 are duplicated Helix/ZK calls per request.

Fix

Keep the PR's intent (/tableConfigs/validate and /tableConfigs/tune surface cluster-aware issues for fail-fast feedback), but stop leaking the new checks onto create/update:

  1. Remove the cluster-aware blocks from the shared validateConfig(TableConfigs, ...) helper.
  2. Extract a new private helper validateClusterAwareConfig(TableConfig, Set<ValidationType>) that runs only the three cluster-aware checks, gated by skip types.
  3. Invoke the new helper only from parseAndValidateTableConfigs (the path shared by validateConfig and tuneConfig endpoints). The addConfig and updateConfig paths are not touched — they continue to rely on:
    • The existing inline tableTasksValidation call in addConfig (properly gated by ignoreActiveTasks).
    • PinotHelixResourceManager.addTable(...) and updateTableConfig(...) running tenant / minion checks internally (unchanged behavior, no double-call).
  4. Revert the @ApiParam doc on addConfig and updateConfig back to (ALL|TASK|UPSERT) since those endpoints don't consume the new skip types. The validate/tune endpoints keep the expanded doc.

Net effect: /tableConfigs/validate and /tableConfigs/tune behave identically to #16675. addConfig and updateConfig are restored to their pre-#16675 behavior.

Test plan

  • ./mvnw -pl pinot-controller -am compile — succeeds
  • ./mvnw spotless:apply checkstyle:check license:check -pl pinot-controller — all clean
  • PR Enhance /tableConfigs/validate endpoint with cluster-aware validations #16675's tests still pass: TableConfigsRestletResourceTest#testValidateConfigWithClusterValidationSkipTypes and testValidateConfigClusterValidationsEnabled (2 run, 0 failures)
  • Manual: create a table with taskConfig + running task → PUT /tableConfigs/{tableName} succeeds (pre-fix: blocked with "dangling task data")
  • Manual: POST /tableConfigs?ignoreActiveTasks=true succeeds without also needing validationTypesToSkip=ACTIVE_TASKS

🤖 Generated with Claude Code

…e endpoints only

PR apache#16675 placed the new TENANT/MINION_INSTANCES/ACTIVE_TASKS checks
inside the shared private validateConfig helper, so they leaked onto
the create and update paths. Move them into a new helper invoked only
from parseAndValidateTableConfigs (the validate/tune flow). This
restores update flow behavior (active-task check no longer blocks
updates), drops the duplicate tableTasksValidation/tenant/minion calls
on create, and keeps the PR's intent of fail-fast pre-flight feedback
on /tableConfigs/validate and /tableConfigs/tune intact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 26, 2026

Codecov Report

❌ Patch coverage is 75.00000% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.30%. Comparing base (3d62d6f) to head (3ee4885).
⚠️ Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
...ler/api/resources/TableConfigsRestletResource.java 75.00% 5 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18580      +/-   ##
============================================
+ Coverage     64.28%   64.30%   +0.02%     
  Complexity     1137     1137              
============================================
  Files          3335     3335              
  Lines        205895   205903       +8     
  Branches      32129    32126       -3     
============================================
+ Hits         132359   132406      +47     
+ Misses        62888    62865      -23     
+ Partials      10648    10632      -16     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-21 64.30% <75.00%> (+0.02%) ⬆️
temurin 64.30% <75.00%> (+0.02%) ⬆️
unittests 64.30% <75.00%> (+0.02%) ⬆️
unittests1 56.79% <ø> (+0.04%) ⬆️
unittests2 36.85% <75.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants