-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Introduce primaryTerm and inSynAllocationIds in IndexRoutingTable #20194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Introduce primaryTerm and inSynAllocationIds in IndexRoutingTable #20194
Conversation
Signed-off-by: Swetha Guptha <gupthasg@amazon.com>
WalkthroughIndexRoutingTable is extended to expose primary term tracking and in-sync allocation IDs through new constructor parameters, fields, accessor methods, and builder enhancements with validation. Serialization logic updated for version compatibility, and RoutingTableIncrementalDiff adjusted to pass the new parameters. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~35 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
🧹 Nitpick comments (1)
server/src/main/java/org/opensearch/cluster/routing/IndexRoutingTable.java (1)
970-988:build()should infernumberOfShardsfromshards.size()if not explicitly set.The builder's
numberOfShardsmust matchshards.size()for the validation and loop logic to work correctly. However, several code paths (e.g.,readFrom()) don't callsetNumberOfShards(). Consider inferring it from the actual shards map:public IndexRoutingTable build() { + // Infer numberOfShards from shards if not explicitly set + if (numberOfShards <= 0) { + numberOfShards = shards.size(); + } + if (primaryTerms == null) { initializePrimaryTerms(); } else if (primaryTerms.length != numberOfShards) {This ensures consistency between the shards map and the new fields.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
server/src/main/java/org/opensearch/cluster/routing/IndexRoutingTable.java(12 hunks)server/src/main/java/org/opensearch/cluster/routing/RoutingTableIncrementalDiff.java(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: assemble (21, macos-15)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (25, macos-15)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: Validate
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: verify-changelog
- GitHub Check: Analyze (java)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: gradle-check
🔇 Additional comments (2)
server/src/main/java/org/opensearch/cluster/routing/IndexRoutingTable.java (2)
161-183: Validation logic looks correct and appropriately lenient.The validation only runs when fields are populated, which aligns with the PR objective of incremental adoption. The size/length checks against
indexMetadata.getNumberOfShards()are appropriate.
995-1004: Pretty print additions look good with appropriate null checks.
| this.primaryTerms = primaryTerms; | ||
| this.inSyncAllocationIds = Collections.unmodifiableMap(inSyncAllocationIds); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Constructor does not handle null inSyncAllocationIds, causing NPE.
Line 127 calls Collections.unmodifiableMap(inSyncAllocationIds) without a null check. Since callers (e.g., RoutingTableIncrementalDiff.apply()) pass null, this will throw NullPointerException.
Apply this diff to handle nulls defensively:
this.allActiveShards = Collections.unmodifiableList(allActiveShards);
this.primaryTerms = primaryTerms;
- this.inSyncAllocationIds = Collections.unmodifiableMap(inSyncAllocationIds);
+ this.inSyncAllocationIds = inSyncAllocationIds != null
+ ? Collections.unmodifiableMap(inSyncAllocationIds)
+ : Collections.emptyMap();
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| this.primaryTerms = primaryTerms; | |
| this.inSyncAllocationIds = Collections.unmodifiableMap(inSyncAllocationIds); | |
| } | |
| this.primaryTerms = primaryTerms; | |
| this.inSyncAllocationIds = inSyncAllocationIds != null | |
| ? Collections.unmodifiableMap(inSyncAllocationIds) | |
| : Collections.emptyMap(); | |
| } |
🤖 Prompt for AI Agents
In server/src/main/java/org/opensearch/cluster/routing/IndexRoutingTable.java
around lines 126-128, the constructor calls
Collections.unmodifiableMap(inSyncAllocationIds) without checking for null which
causes a NPE when callers pass null; change the assignment to defensively handle
null by using an empty map when inSyncAllocationIds is null (e.g., set
this.inSyncAllocationIds = Collections.unmodifiableMap(inSyncAllocationIds ==
null ? Collections.emptyMap() : inSyncAllocationIds)), so the field is never
null and remains unmodifiable.
| public long getPrimaryTerm(int shardId) { | ||
| return this.primaryTerms[shardId]; | ||
| } | ||
|
|
||
| public long[] getPrimaryTerms() { | ||
| return this.primaryTerms.clone(); | ||
| } | ||
|
|
||
| public Set<String> getInSyncAllocationIds(int shardId) { | ||
| assert shardId >= 0 && shardId < inSyncAllocationIds.size(); | ||
| return inSyncAllocationIds.get(shardId); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accessors are not null-safe for unpopulated fields.
Per the PR description, primaryTerms and inSyncAllocationIds may be null/empty by default. However, these getters will throw NullPointerException when the fields are not populated:
getPrimaryTerm(int)accessesprimaryTerms[shardId]without null checkgetPrimaryTerms()callsprimaryTerms.clone()without null checkgetInSyncAllocationIds(int)accesses the map without null check
Consider adding null checks and returning sensible defaults, or document that callers must check getPrimaryTerms() != null first. Example:
public long getPrimaryTerm(int shardId) {
+ if (primaryTerms == null) {
+ throw new IllegalStateException("primaryTerms not populated for index " + index);
+ }
return this.primaryTerms[shardId];
}
public long[] getPrimaryTerms() {
+ return this.primaryTerms != null ? this.primaryTerms.clone() : null;
- return this.primaryTerms.clone();
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| public long getPrimaryTerm(int shardId) { | |
| return this.primaryTerms[shardId]; | |
| } | |
| public long[] getPrimaryTerms() { | |
| return this.primaryTerms.clone(); | |
| } | |
| public Set<String> getInSyncAllocationIds(int shardId) { | |
| assert shardId >= 0 && shardId < inSyncAllocationIds.size(); | |
| return inSyncAllocationIds.get(shardId); | |
| } | |
| public long getPrimaryTerm(int shardId) { | |
| if (primaryTerms == null) { | |
| throw new IllegalStateException("primaryTerms not populated for index " + index); | |
| } | |
| return this.primaryTerms[shardId]; | |
| } | |
| public long[] getPrimaryTerms() { | |
| return this.primaryTerms != null ? this.primaryTerms.clone() : null; | |
| } | |
| public Set<String> getInSyncAllocationIds(int shardId) { | |
| assert shardId >= 0 && shardId < inSyncAllocationIds.size(); | |
| return inSyncAllocationIds.get(shardId); | |
| } |
🤖 Prompt for AI Agents
In server/src/main/java/org/opensearch/cluster/routing/IndexRoutingTable.java
around lines 311-322, the accessors assume primaryTerms and inSyncAllocationIds
are non-null and will NPE if unpopulated; update the three getters to be
null-safe by checking for null and returning sensible defaults (for
getPrimaryTerm(int) return 0L or throw a controlled
IndexOutOfBounds/IllegalState with clear message if shardId is invalid or
primaryTerms is null, for getPrimaryTerms() return an empty long[] when
primaryTerms is null, and for getInSyncAllocationIds(int) return an empty Set or
Collections.emptySet() when inSyncAllocationIds is null), and ensure existing
assertions/validation of shardId remain; keep behavior consistent with class
invariants and document any change in Javadoc if necessary.
| if (!Arrays.equals(primaryTerms, that.primaryTerms)) return false; | ||
| if (!inSyncAllocationIds.equals(that.inSyncAllocationIds)) return false; | ||
| return true; | ||
| } | ||
|
|
||
| @Override | ||
| public int hashCode() { | ||
| int result = index.hashCode(); | ||
| result = 31 * result + shards.hashCode(); | ||
| result = 31 * result + Arrays.hashCode(primaryTerms); | ||
| result = 31 * result + inSyncAllocationIds.hashCode(); | ||
| return result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
equals() and hashCode() will NPE if inSyncAllocationIds is null.
Line 425 calls inSyncAllocationIds.equals() and line 434 calls inSyncAllocationIds.hashCode() without null checks. Use Objects.equals() and Objects.hashCode() for null safety.
- if (!Arrays.equals(primaryTerms, that.primaryTerms)) return false;
- if (!inSyncAllocationIds.equals(that.inSyncAllocationIds)) return false;
+ if (!Arrays.equals(primaryTerms, that.primaryTerms)) return false;
+ if (!Objects.equals(inSyncAllocationIds, that.inSyncAllocationIds)) return false; result = 31 * result + Arrays.hashCode(primaryTerms);
- result = 31 * result + inSyncAllocationIds.hashCode();
+ result = 31 * result + Objects.hashCode(inSyncAllocationIds);This requires adding import java.util.Objects;.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In server/src/main/java/org/opensearch/cluster/routing/IndexRoutingTable.java
around lines 424 to 435, equals() and hashCode() call
inSyncAllocationIds.equals(...) and inSyncAllocationIds.hashCode() without null
checks which will NPE if inSyncAllocationIds is null; replace those direct calls
with Objects.equals(inSyncAllocationIds, that.inSyncAllocationIds) in equals()
and Objects.hashCode(inSyncAllocationIds) (or include it in Objects.hash(...)
for the overall hash) in hashCode(), and add the import java.util.Objects; to
the file.
| if (in.getVersion().onOrAfter(Version.V_3_4_0)) { | ||
| builder.setPrimaryTerms(in.readLongArray()); | ||
| int inSyncAllocationIdsSize = in.readVInt(); | ||
| for (int i = 0; i < inSyncAllocationIdsSize; i++) { | ||
| int shardId = in.readVInt(); | ||
| Set<String> allocationIds = DiffableUtils.StringSetValueSerializer.getInstance().read(in, shardId); | ||
| builder.setInSyncAllocationIds(shardId, allocationIds); | ||
| } | ||
| } | ||
|
|
||
| return builder.build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deserialization will fail: numberOfShards not set before build() is called.
readFrom() calls builder.setPrimaryTerms() with the deserialized array, but never calls setNumberOfShards(). When build() executes, it checks primaryTerms.length != numberOfShards (line 973), but numberOfShards is still 0 (default). This will throw IllegalStateException for any index with shards.
Apply this diff to set numberOfShards before build:
if (in.getVersion().onOrAfter(Version.V_3_4_0)) {
builder.setPrimaryTerms(in.readLongArray());
int inSyncAllocationIdsSize = in.readVInt();
for (int i = 0; i < inSyncAllocationIdsSize; i++) {
int shardId = in.readVInt();
Set<String> allocationIds = DiffableUtils.StringSetValueSerializer.getInstance().read(in, shardId);
builder.setInSyncAllocationIds(shardId, allocationIds);
}
}
+ builder.setNumberOfShards(size);
return builder.build();Alternatively, infer numberOfShards from shards.size() in build() if not explicitly set.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In server/src/main/java/org/opensearch/cluster/routing/IndexRoutingTable.java
around lines 462 to 472, ensure numberOfShards is set before calling
builder.build(): after reading the primary terms array from the stream, call
builder.setNumberOfShards(primaryTerms.length) (or otherwise set numberOfShards
to the appropriate value) so build() does not throw when it compares
primaryTerms.length to numberOfShards; alternatively, modify build() to infer
numberOfShards from shards.size() when numberOfShards is unset, but the primary
fix is to set numberOfShards from the deserialized primary terms before
returning builder.build().
| if (out.getVersion().onOrAfter(Version.V_3_4_0)) { | ||
| out.writeVLongArray(primaryTerms); | ||
| out.writeVInt(inSyncAllocationIds.size()); | ||
| for (final Map.Entry<Integer, Set<String>> cursor : inSyncAllocationIds.entrySet()) { | ||
| out.writeVInt(cursor.getKey()); | ||
| DiffableUtils.StringSetValueSerializer.getInstance().write(cursor.getValue(), out); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
writeTo can throw NPE when serializing to V_3_4_0+ if fields are null.
If an IndexRoutingTable was constructed with null primaryTerms or inSyncAllocationIds (e.g., from incremental diff or older version deserialization), serializing it to a V_3_4_0+ node will throw NullPointerException at out.writeVLongArray(primaryTerms).
Either ensure constructor always initializes non-null values, or add null checks in serialization:
if (out.getVersion().onOrAfter(Version.V_3_4_0)) {
- out.writeVLongArray(primaryTerms);
- out.writeVInt(inSyncAllocationIds.size());
+ out.writeVLongArray(primaryTerms != null ? primaryTerms : new long[0]);
+ Map<Integer, Set<String>> ids = inSyncAllocationIds != null ? inSyncAllocationIds : Collections.emptyMap();
+ out.writeVInt(ids.size());
- for (final Map.Entry<Integer, Set<String>> cursor : inSyncAllocationIds.entrySet()) {
+ for (final Map.Entry<Integer, Set<String>> cursor : ids.entrySet()) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (out.getVersion().onOrAfter(Version.V_3_4_0)) { | |
| out.writeVLongArray(primaryTerms); | |
| out.writeVInt(inSyncAllocationIds.size()); | |
| for (final Map.Entry<Integer, Set<String>> cursor : inSyncAllocationIds.entrySet()) { | |
| out.writeVInt(cursor.getKey()); | |
| DiffableUtils.StringSetValueSerializer.getInstance().write(cursor.getValue(), out); | |
| } | |
| } | |
| if (out.getVersion().onOrAfter(Version.V_3_4_0)) { | |
| out.writeVLongArray(primaryTerms != null ? primaryTerms : new long[0]); | |
| Map<Integer, Set<String>> ids = inSyncAllocationIds != null ? inSyncAllocationIds : Collections.emptyMap(); | |
| out.writeVInt(ids.size()); | |
| for (final Map.Entry<Integer, Set<String>> cursor : ids.entrySet()) { | |
| out.writeVInt(cursor.getKey()); | |
| DiffableUtils.StringSetValueSerializer.getInstance().write(cursor.getValue(), out); | |
| } | |
| } |
🤖 Prompt for AI Agents
In server/src/main/java/org/opensearch/cluster/routing/IndexRoutingTable.java
around lines 486-493, the writeTo method may NPE when serializing to
Version.V_3_4_0+ if primaryTerms or inSyncAllocationIds are null; add null
checks and defensive defaults before writing: if primaryTerms is null write an
empty VLong array (and avoid calling writeVLongArray on null), and if
inSyncAllocationIds is null write a size of 0 and skip the loop (or initialize
them to empty collections at construction/deserialization); update the
serialization branch to handle nulls safely so no NPE occurs during writeTo.
| private long[] primaryTerms = null;; | ||
| private final Map<Integer, Set<String>> inSyncAllocationIds; | ||
| private int numberOfShards; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: Double semicolon on line 524.
- private long[] primaryTerms = null;;
+ private long[] primaryTerms = null;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| private long[] primaryTerms = null;; | |
| private final Map<Integer, Set<String>> inSyncAllocationIds; | |
| private int numberOfShards; | |
| private long[] primaryTerms = null; | |
| private final Map<Integer, Set<String>> inSyncAllocationIds; | |
| private int numberOfShards; |
🤖 Prompt for AI Agents
In server/src/main/java/org/opensearch/cluster/routing/IndexRoutingTable.java
around lines 524 to 526, there's a stray double semicolon after the declaration
of primaryTerms ("private long[] primaryTerms = null;;"); remove the extra
semicolon so the line reads with a single terminating semicolon, then
reformat/compile to ensure no lint or compile issues remain.
| private void initializePrimaryTerms() { | ||
| assert primaryTerms == null; | ||
| if (numberOfShards < 0) { | ||
| throw new IllegalStateException("you must set the number of shards before setting/reading primary terms"); | ||
| } | ||
| primaryTerms = new long[numberOfShards]; | ||
| Arrays.fill(primaryTerms, SequenceNumbers.UNASSIGNED_PRIMARY_TERM); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
numberOfShards < 0 check is ineffective; numberOfShards defaults to 0.
The check at line 555 intends to detect if numberOfShards was not set, but since int defaults to 0 (not negative), this condition will never trigger naturally. When numberOfShards is 0, initializePrimaryTerms() will create an empty array, which may cause subtle bugs downstream.
Initialize numberOfShards to -1 to properly detect the "not set" state:
- private int numberOfShards;
+ private int numberOfShards = -1;Or change the check:
- if (numberOfShards < 0) {
+ if (numberOfShards <= 0) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| private void initializePrimaryTerms() { | |
| assert primaryTerms == null; | |
| if (numberOfShards < 0) { | |
| throw new IllegalStateException("you must set the number of shards before setting/reading primary terms"); | |
| } | |
| primaryTerms = new long[numberOfShards]; | |
| Arrays.fill(primaryTerms, SequenceNumbers.UNASSIGNED_PRIMARY_TERM); | |
| } | |
| private void initializePrimaryTerms() { | |
| assert primaryTerms == null; | |
| if (numberOfShards <= 0) { | |
| throw new IllegalStateException("you must set the number of shards before setting/reading primary terms"); | |
| } | |
| primaryTerms = new long[numberOfShards]; | |
| Arrays.fill(primaryTerms, SequenceNumbers.UNASSIGNED_PRIMARY_TERM); | |
| } |
🤖 Prompt for AI Agents
In server/src/main/java/org/opensearch/cluster/routing/IndexRoutingTable.java
around lines 553 to 560, the guard that checks "numberOfShards < 0" is
ineffective because numberOfShards defaults to 0, so initialize detection of an
unset value by either (a) set numberOfShards' default to -1 where it's declared
so the existing check works, or (b) change the condition here to "numberOfShards
<= 0" (or "<= 0" if zero shards should be rejected) to treat 0 as an
invalid/unset state before allocating the primaryTerms array; pick one approach
and update the declaration or the check accordingly and add a brief comment
explaining the chosen sentinel behavior.
| public IndexRoutingTable apply(IndexRoutingTable part) { | ||
| return new IndexRoutingTable(index, indexShardRoutingTables.apply(part.getShards())); | ||
| // TODO: fix this for remote publication | ||
| return new IndexRoutingTable(index, indexShardRoutingTables.apply(part.getShards()), null, null); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Passing null for inSyncAllocationIds will cause NPE in IndexRoutingTable constructor.
The IndexRoutingTable constructor at line 127 calls Collections.unmodifiableMap(inSyncAllocationIds), which throws NullPointerException when null is passed. This breaks the incremental diff apply path.
Apply this diff to pass empty collections instead of nulls:
@Override
public IndexRoutingTable apply(IndexRoutingTable part) {
// TODO: fix this for remote publication
- return new IndexRoutingTable(index, indexShardRoutingTables.apply(part.getShards()), null, null);
+ return new IndexRoutingTable(index, indexShardRoutingTables.apply(part.getShards()), part.getPrimaryTerms(), part.getInSyncAllocationIds());
}Alternatively, if propagating the values from the existing part is not intended, the constructor should handle null inputs gracefully.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| public IndexRoutingTable apply(IndexRoutingTable part) { | |
| return new IndexRoutingTable(index, indexShardRoutingTables.apply(part.getShards())); | |
| // TODO: fix this for remote publication | |
| return new IndexRoutingTable(index, indexShardRoutingTables.apply(part.getShards()), null, null); | |
| } | |
| public IndexRoutingTable apply(IndexRoutingTable part) { | |
| // TODO: fix this for remote publication | |
| return new IndexRoutingTable(index, indexShardRoutingTables.apply(part.getShards()), part.getPrimaryTerms(), part.getInSyncAllocationIds()); | |
| } |
|
❌ Gradle check result for 61b2c5c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Description
First PR for #20062: Introduce primaryTerms and inSyncAllocationIds fields in IndexRoutingTable to make IndexRoutingTable the source of truth for these fields instead of IndexMetadata.
Changes
Not Included
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
[ ] API changes companion pull request created, if applicable.[ ] Public documentation issue/PR created, if applicable.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
Release Notes
✏️ Tip: You can customize this high-level summary in your review settings.