-
Notifications
You must be signed in to change notification settings - Fork 581
HDDS-14110. [DiskBalancer] Show EstimatedBytesToMove only during active balancing and improve threshold check message #9465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: HDDS-5713
Are you sure you want to change the base?
Conversation
…eshold error msg improve
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR addresses two issues in the DiskBalancer service: (1) preventing the display of EstimatedBytesToMove and EstimatedTimeLeft when no container movement is occurring, and (2) improving threshold validation to exclude boundary values 0 and 100 with a clearer error message.
Key Changes:
- Updated threshold validation to exclude 0 and 100 (changed from
< 0dto<= 0d), preventing edge cases that would cause excessive or meaningless balancing - Modified
getDiskBalancerInfo()to only calculate and reportbytesToMovewhen containers are actively being balanced (RUNNINGstate AND non-emptyinProgressContainers) - Enhanced error message to clarify that the threshold range is exclusive: "Threshold must be a percentage(double) in the range 0 to 100 both exclusive."
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/diskbalancer/DiskBalancerConfiguration.java |
Updated threshold validation to exclude 0 and 100, and improved error message clarity |
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/diskbalancer/DiskBalancerService.java |
Added check for non-empty inProgressContainers before calculating bytesToMove and updated comments |
hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/diskbalancer/TestDiskBalancerService.java |
Added test coverage for the new inProgressContainers check in getDiskBalancerInfo() |
Comments suppressed due to low confidence (1)
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/diskbalancer/DiskBalancerService.java:724
- getInProgressContainers exposes the internal representation stored in field inProgressContainers. The value may be modified after this call to getInProgressContainers.
public Set<ContainerID> getInProgressContainers() {
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (threshold <= 0d || threshold >= 100d) { | ||
| throw new IllegalArgumentException( | ||
| "Threshold must be a percentage(double) in the range 0 to 100."); | ||
| "Threshold must be a percentage(double) in the range 0 to 100 both exclusive."); | ||
| } |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The threshold validation logic has been changed to exclude both 0 and 100 (using <= 0d instead of < 0d). However, there are no tests validating this boundary condition. Consider adding test cases to verify that:
- Threshold values of 0 and 100 are rejected with the appropriate error message
- Valid threshold values like 0.001 and 99.999 are accepted
This will ensure the validation logic works correctly and prevent regressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
What changes were proposed in this pull request?
A threshold of 0 means any deviation from ideal usage (even 0.01%) triggers
container movement
This leads to excessive and continuous balancing operations and results in unnecessary I/O overhead and resource consumption
A Threshold value can never be 100.0% as it would mean allow moving 100% of a disk's contents, effectively emptying one disk.
Suggested improvement:
Rather the error message should clarify that 0 and 100 is excluded. The validation is being updated to exclude 0, requiring threshold to be in
the range (0, 100) exclusive.
new error msg:
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-14110
How was this patch tested?
Added check for estimatedBytes in unit test
TestDiskBalancerService.Tested manually:
before patch:
After code chnages output fixed:
Threshold error output: