HDDS-14870. Allow balancing of over replicated and quasi closed containers#9964
HDDS-14870. Allow balancing of over replicated and quasi closed containers#9964sarvekshayr wants to merge 3 commits intoapache:masterfrom
Conversation
sreejasahithi
left a comment
There was a problem hiding this comment.
Thanks @sarvekshayr
I wanted to double check my understanding:
When includeNonStandardContainers is true, ContainerBalancerSelectionCriteria correctly allows:
- Case A: CLOSED + OVER_REPLICATED containers (with min CLOSED replicas + QUASI_CLOSED replicas)
- Case B: QUASI_CLOSED containers with all QUASI_CLOSED replicas
However, MoveManager.move() still enforces:
- Health must be HEALTHY – so OVER_REPLICATED is rejected with REPLICATION_NOT_HEALTHY_BEFORE_MOVE
- Container state must be CLOSED – so QUASI_CLOSED is rejected with REPLICATION_FAIL_CONTAINER_NOT_CLOSED
Since MoveManager doesn’t consider the config, it never sees includeNonStandardContainers. That would mean these containers get selected but fail when we actually try to move them.
Did you intend to update MoveManager as well to honor this config, or is there another path i am missing? I want to make sure i am not misreading the flow.
...va/org/apache/hadoop/hdds/scm/container/balancer/TestContainerBalancerSelectionCriteria.java
Outdated
Show resolved
Hide resolved
| * @param replicas all replicas of the container | ||
| * @return true if container and replica are eligible for balancing, else false | ||
| */ | ||
| private boolean isContainerClosed(ContainerInfo container, |
There was a problem hiding this comment.
Lets have a separate method isContainerClosedRelaxed and use that if the config is true.
| LOG.debug("Excluding container {} with replicas {} as its health is {}.", container, replicas, state); | ||
| return false; | ||
| if (state == ContainerHealthResult.HealthState.HEALTHY) { | ||
| return true; |
| if (replicaState == ContainerReplicaProto.State.CLOSED) { | ||
| return true; | ||
| } | ||
|
|
There was a problem hiding this comment.
if replica state is quasi closed and replica is empty i.e bytesUsed == 0 we should exclude it
There was a problem hiding this comment.
Please add a test for this as well. 3 CLOSED 1 QC empty replica
What changes were proposed in this pull request?
Allow container balancer to include containers if
This is allowed only if the new config
hdds.container.balancer.include.non.standard.containersis set to true.What is the link to the Apache JIRA
HDDS-14870
How was this patch tested?
Added tests in
TestContainerBalancerSelectionCriteriaandTestMoveManager.