TELCODOCS-2644 clarify the NUMA aware scheduler scoring behavior #106102
kquinn1204 wants to merge 1 commit into openshift:main from
Conversation
🤖 Mon Feb 09 17:49:26 - Prow CI generated the docs preview:
[id="cnf-balanceallocated-example_{context}"]
== BalancedAllocation strategy example

Before: The `BalancedAllocation` strategy assigns workloads to the NUMA node with the most balanced resource utilization across CPU and memory. The goal is to prevent imbalanced usage, such as high CPU utilization with underutilized memory. Assume a worker node has the following NUMA node states:

After: The `BalancedAllocation` strategy favors worker nodes that exhibit the most balanced resource utilization (CPU versus Memory) within their NUMA zones. This prevents _skewed_ usage where a node might be out of CPU cycles while having massive amounts of idle memory.
🤖 [error] RedHat.TermsErrors: Use 'compared to' rather than 'versus'. For more information, see RedHat.TermsErrors.
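For reviewers who want the math behind "balanced": the upstream kube-scheduler `NodeResourcesBalancedAllocation` plugin scores by how close the per-resource utilization fractions are to each other. A minimal Python sketch of that scoring (illustrative only, not the operator's actual code; `MAX_SCORE` mirrors kube-scheduler's maximum node score of 100):

```python
import math

MAX_SCORE = 100  # kube-scheduler's maximum node score

def balanced_allocation_score(requested: dict, allocatable: dict) -> int:
    """Score a NUMA zone: 100 when CPU and memory utilization are identical,
    lower as the utilization fractions drift apart.
    score = (1 - stddev(utilization fractions)) * MAX_SCORE."""
    fractions = [requested[r] / allocatable[r] for r in requested]
    mean = sum(fractions) / len(fractions)
    std = math.sqrt(sum((f - mean) ** 2 for f in fractions) / len(fractions))
    return int((1 - std) * MAX_SCORE)
```

For example, a zone at 50% CPU / 50% memory scores 100, while a zone at 75% CPU / 25% memory scores 75, so the balanced zone wins.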
. *Node Selection*: The scheduler first selects a suitable worker node based on cluster-wide criteria. For example taints, labels, or resource availability.

Before: . After a worker node is selected, the scheduler evaluates its NUMA nodes and applies a scoring strategy to decide which NUMA node will handle the workload.

After: . *NUMA-Aware Scoring*: After a worker node is selected, the scheduler evaluates the available resources within each worker node's NUMA zones. It applies a scoring strategy to select the worker node that best fits the desired resource distribution.
This is confusing, because the Node selection step talks about a singular selected node.
Change it to say Node filtering, and describe that it keeps only the nodes that are suitable based on the criteria you have plus the available NUMA zone resources.
The second step is then Node selection, and that one uses the score to pick the "best" node out of the shortlist based on the strategy.
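The two-phase flow described in this suggestion (filter down to a shortlist, then score to pick the best) can be sketched roughly like this. The node/zone dictionaries and the headroom-based score are hypothetical stand-ins for illustration, not the operator's actual data model:

```python
def fits(zone_free: dict, request: dict) -> bool:
    """True if a single NUMA zone has enough free capacity for the request."""
    return all(zone_free.get(r, 0) >= qty for r, qty in request.items())

def filter_nodes(nodes: dict, request: dict) -> dict:
    """Phase 1, filtering: keep only worker nodes where at least one
    NUMA zone can fit the whole request."""
    return {name: zones for name, zones in nodes.items()
            if any(fits(zone, request) for zone in zones)}

def select_node(candidates: dict, request: dict) -> str:
    """Phase 2, selection: score each shortlisted node and pick the best.
    Here the score is simply the free headroom left in the best-fitting
    zone (a LeastAllocated-flavored tie-break; illustrative only)."""
    def best_zone_headroom(zones):
        return max(sum(z[r] - request.get(r, 0) for r in z)
                   for z in zones if fits(z, request))
    return max(candidates, key=lambda name: best_zone_headroom(candidates[name]))
```

With this shape, a node whose zones are all too small never reaches the scoring phase at all, which is the distinction the comment is asking the docs to make.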
. After a workload is scheduled, the selected NUMA node’s resources are updated to reflect the allocation.

Before: The default strategy applied is the `LeastAllocated` strategy. This assigns workloads to the NUMA node with the most available resources, that is, the least utilized NUMA node. The goal of this strategy is to spread workloads across NUMA nodes to reduce contention and avoid hotspots.

After: . *Local Allocation*: Once the pod is assigned to a worker node, the node-level components (Topology Manager) perform the authoritative allocation of specific CPUs and memory. The scheduler does not influence this final selection.
Maybe:

Before: . *Local Allocation*: Once the pod is assigned to a worker node, the node-level components (Topology Manager) perform the authoritative allocation of specific CPUs and memory. The scheduler does not influence this final selection.

After: . *Local Allocation*: Once the pod is assigned to a worker node, the node-level components (CPU and Topology Managers) perform the authoritative allocation of specific CPUs and memory. The scheduler does not influence this final selection.

The CPU Manager picks the CPUs based on input from all the others, such as the Topology Manager, the Memory Manager (for hugepages affinity), and other device managers (for example, for SR-IOV affinity).
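As a reference point for the `LeastAllocated` description above: the upstream least-allocated math is the free fraction per resource, averaged and scaled to the maximum node score. A rough sketch (illustrative, not the scheduler's actual code):

```python
MAX_SCORE = 100  # kube-scheduler's maximum node score

def least_allocated_score(requested: dict, allocatable: dict) -> int:
    """LeastAllocated: the emptier the NUMA zone, the higher the score.
    Per resource: (allocatable - requested) / allocatable, then averaged
    over all resources and scaled to MAX_SCORE."""
    fractions = [(allocatable[r] - requested[r]) / allocatable[r]
                 for r in requested]
    return int(sum(fractions) / len(fractions) * MAX_SCORE)
```

For example, a zone at 25% utilization scores 75 while a zone at 75% utilization scores 25, so workloads spread toward the idle zone, which is the anti-hotspot behavior the paragraph describes.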
Before: The following table summarizes the different strategies and their outcomes:

After: The following table summarizes the different OpenShift Node selection strategies and their outcomes:
Force-pushed a936368 to 1569e2a.
LGTM with this caveat: when you write "(CPU and Topology Managers)" I'd either use "Topology and resource managers" to group the active managers, or spell them all out: "(CPU, Memory, Device and Topology managers)". I'm fine with both approaches.
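Since this thread is about which managers feed the final placement: the Topology Manager merges NUMA-affinity hints from the CPU, Memory, and device managers by AND-ing bitmasks across one hint per provider and keeping the narrowest preferred result. A toy sketch of that merge (the hint tuples here are hypothetical simplifications of the real kubelet structures):

```python
from itertools import product

def merge_hints(provider_hints):
    """Topology Manager-style hint merge (sketch). Each provider submits a
    list of (numa_bitmask, preferred) hints; a merged candidate is the AND
    of one bitmask per provider, preferred only if all inputs were
    preferred. The winner is the preferred candidate with the fewest NUMA
    nodes set; returns None if the providers cannot agree on any node."""
    best = None
    for combo in product(*provider_hints):
        mask = ~0
        preferred = True
        for numa_mask, pref in combo:
            mask &= numa_mask
            preferred = preferred and pref
        if mask == 0:  # providers disagree on NUMA placement
            continue
        rank = (not preferred, bin(mask).count("1"))  # preferred + narrow wins
        if best is None or rank < best[0]:
            best = (rank, mask, preferred)
    return None if best is None else (best[1], best[2])
```

For example, if the CPU side offers NUMA 0 (preferred) or NUMA 0+1 (not preferred) and an SR-IOV device is pinned to NUMA 0, the merge lands on NUMA 0 preferred; if the device only exists on a NUMA node the CPUs cannot use, no merged hint survives.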
Force-pushed 02d06d0 to 1aafec5.
@kquinn1204: all tests passed!
[TELCODOCS-2644]: Clarify the NUMA aware scheduler scoring behavior
Version(s): 4.18, 4.19, 4.20, 4.21 and main
Issue: https://issues.redhat.com/browse/TELCODOCS-2644
Link to docs preview: https://106102--ocpdocs-pr.netlify.app/openshift-enterprise/latest/scalability_and_performance/cnf-numa-aware-scheduling.html#cnf-numa-resource-scheduling-strategies_numa-aware
QE review:
Additional information: