
TELCODOCS-2644 clarify the NUMA aware scheduler scoring behavior #106102

Open

kquinn1204 wants to merge 1 commit into openshift:main from kquinn1204:TELCODOCS-2644

Conversation

@kquinn1204 (Contributor) commented Feb 6, 2026

[TELCODOCS-2644]: Clarify the NUMA aware scheduler scoring behavior

Version(s): 4.18, 4.19, 4.20, 4.21 and main

Issue: https://issues.redhat.com/browse/TELCODOCS-2644

Link to docs preview: https://106102--ocpdocs-pr.netlify.app/openshift-enterprise/latest/scalability_and_performance/cnf-numa-aware-scheduling.html#cnf-numa-resource-scheduling-strategies_numa-aware

QE review:

  • QE has approved this change.

Additional information:

@openshift-ci bot added the size/M label (denotes a PR that changes 30-99 lines, ignoring generated files) on Feb 6, 2026
@ocpdocs-previewbot commented Feb 6, 2026

[id="cnf-balanceallocated-example_{context}"]
== BalancedAllocation strategy example
The `BalancedAllocation` strategy assigns workloads to the NUMA node with the most balanced resource utilization across CPU and memory. The goal is to prevent imbalanced usage, such as high CPU utilization with underutilized memory. Assume a worker node has the following NUMA node states:
The `BalancedAllocation` strategy favors worker nodes that exhibit the most balanced resource utilization (CPU versus Memory) within their NUMA zones. This prevents _skewed_ usage where a node might be out of CPU cycles while having massive amounts of idle memory.
🤖 [error] RedHat.TermsErrors: Use 'compared to' rather than 'versus'. For more information, see RedHat.TermsErrors.
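For the scoring intuition behind the `BalancedAllocation` paragraph above, here is a minimal sketch assuming a standard-deviation penalty over per-resource utilization fractions. The function name, formula, and sample numbers are illustrative assumptions, not the scheduler's actual code.

# Minimal sketch (illustration only, not the scheduler's real code) of a
# BalancedAllocation-style score: rate a NUMA zone by how evenly CPU and
# memory would be utilized after placing the workload.
from statistics import pstdev

MAX_SCORE = 100

def balanced_allocation_score(requested, allocatable):
    """Higher score = more even utilization fractions across resources."""
    fractions = [requested[r] / allocatable[r] for r in allocatable]
    # Penalize skew: the more the fractions diverge, the lower the score.
    return round((1 - pstdev(fractions)) * MAX_SCORE)

# Balanced zone: CPU 60% and memory 58% used after placement -> high score (99)
print(balanced_allocation_score({"cpu": 6, "memory": 5.8}, {"cpu": 10, "memory": 10}))
# Skewed zone: CPU 90% used but memory only 20% used -> lower score (65)
print(balanced_allocation_score({"cpu": 9, "memory": 2}, {"cpu": 10, "memory": 10}))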

. *Node Selection*: The scheduler first selects a suitable worker node based on cluster-wide criteria, for example, taints, labels, or resource availability.

. After a worker node is selected, the scheduler evaluates its NUMA nodes and applies a scoring strategy to decide which NUMA node will handle the workload.
. *NUMA-Aware Scoring*: After a worker node is selected, the scheduler evaluates the available resources within each worker node's NUMA zones. It applies a scoring strategy to select the worker node that best fits the desired resource distribution.

This is confusing, because the Node selection step talks about a singular selected node.

Change it so it says Node filtering, and describe that it keeps only the nodes that are suitable based on the criteria you list plus the NUMA zone available resources.

The second step is then Node selection, which uses the score to pick the "best" node out of the shortlist based on the strategy.
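To make that two-phase flow concrete, here is a hedged sketch of filter-then-score node selection. The node records, field names, and the placeholder scoring function are hypothetical, not the scheduler plugin's real API.

# Hypothetical sketch of the flow suggested above: "Node filtering" keeps only
# nodes with a NUMA zone that can satisfy the request; "Node selection" then
# picks the best survivor by the configured scoring strategy.

def filter_nodes(nodes, request):
    # Keep a node if at least one of its NUMA zones fits the whole request.
    return [
        node for node in nodes
        if any(all(zone[r] >= request[r] for r in request) for zone in node["zones_free"])
    ]

def select_node(shortlist, score):
    # Scoring only ranks the filtered shortlist; it never resurrects a filtered-out node.
    return max(shortlist, key=score)

nodes = [
    {"name": "worker-0", "zones_free": [{"cpu": 2, "memory": 4}, {"cpu": 8, "memory": 16}]},
    {"name": "worker-1", "zones_free": [{"cpu": 1, "memory": 2}, {"cpu": 3, "memory": 6}]},
]
request = {"cpu": 4, "memory": 8}

shortlist = filter_nodes(nodes, request)  # worker-1 drops out here
best = select_node(shortlist, lambda n: max(sum(z.values()) for z in n["zones_free"]))
print(best["name"])                       # -> worker-0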

. After a workload is scheduled, the selected NUMA node’s resources are updated to reflect the allocation.

The default strategy applied is the `LeastAllocated` strategy. This assigns workloads to the NUMA node with the most available resources, that is, the least utilized NUMA node. The goal of this strategy is to spread workloads across NUMA nodes to reduce contention and avoid hotspots.
. *Local Allocation*: Once the pod is assigned to a worker node, the node-level components (Topology Manager) perform the authoritative allocation of specific CPUs and memory. The scheduler does not influence this final selection.

Maybe

Suggested change
. *Local Allocation*: Once the pod is assigned to a worker node, the node-level components (Topology Manager) perform the authoritative allocation of specific CPUs and memory. The scheduler does not influence this final selection.
. *Local Allocation*: Once the pod is assigned to a worker node, the node-level components (CPU and Topology Managers) perform the authoritative allocation of specific CPUs and memory. The scheduler does not influence this final selection.

The CPU Manager picks the CPUs based on input from all the others, like the Topology Manager, the Memory Manager (for hugepages affinity), and other device managers (for example, for SR-IOV affinity).
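Since `LeastAllocated` is the default, a minimal sketch of its spreading behavior may help. The weighted free-fraction formula below mirrors the common Kubernetes LeastAllocated idea, but the weights, names, and numbers here are assumptions, not the operator's actual code.

# Illustrative sketch (assumed formula) of the LeastAllocated idea: favor the
# NUMA zone with the largest free-resource fraction, spreading workloads to
# avoid hotspots.
MAX_SCORE = 100

def least_allocated_score(requested, allocatable, weights=None):
    """Weighted average of free fractions after placement, scaled to 0-100."""
    weights = weights or {r: 1 for r in allocatable}
    total = sum(
        weights[r] * (allocatable[r] - requested[r]) * MAX_SCORE / allocatable[r]
        for r in allocatable
    )
    return round(total / sum(weights.values()))

# A lightly loaded zone scores higher than a busy one for the same request.
print(least_allocated_score({"cpu": 3, "memory": 4}, {"cpu": 16, "memory": 32}))    # 84
print(least_allocated_score({"cpu": 14, "memory": 30}, {"cpu": 16, "memory": 32}))  # 9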


The following table summarizes the different strategies and their outcomes:

Suggested change
The following table summarizes the different strategies and their outcomes:
The following table summarizes the different OpenShift Node selection strategies and their outcomes:
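The summarized table itself is not quoted in this thread. For contrast with the spreading strategies above, here is a hedged sketch of a MostAllocated-style score, assuming the table also covers that strategy; it favors the busiest suitable zone, packing workloads to keep other zones free. Numbers are illustrative.

# Assumed sketch of a MostAllocated-style score (bin-packing behavior).
MAX_SCORE = 100

def most_allocated_score(requested, allocatable):
    """Higher score for zones that would be more utilized after placement."""
    fractions = [requested[r] / allocatable[r] for r in allocatable]
    return round(sum(fractions) / len(fractions) * MAX_SCORE)

print(most_allocated_score({"cpu": 12, "memory": 24}, {"cpu": 16, "memory": 32}))  # busy zone -> 75
print(most_allocated_score({"cpu": 2, "memory": 4}, {"cpu": 16, "memory": 32}))    # idle zone -> 12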

@ffromani commented Feb 9, 2026

LGTM with this caveat: when you write "(CPU and Topology Managers)" I'd either use "Topology and resource managers" to group the active managers, or spell them all out: "(CPU, Memory, Device and Topology managers)". I'm fine with both approaches.

@openshift-ci bot commented Feb 9, 2026

@kquinn1204: all tests passed!


