OSDOCS-16871-1-abstracts: SCALE-1: Core Scalability Planning and Reso… #109155
base: main
Changes from all commits
@@ -7,7 +7,7 @@
 = Limit ranges in a LimitRange object

 [role="_abstract"]
-To define compute resource constraints at the object level, create a `LimitRange` object. By creating this object, you can specify the exact amount of resources that an individual pod, container, image, or persistent volume claim can consume.
+To define compute resource constraints at the object level, create a `LimitRange` object. By creating this object, you can specify the exact amount of resources that an individual pod, container, image, image stream, or persistent volume claim can consume.

 All requests to create and modify resources are evaluated against each `LimitRange` object in the project. If the resource violates any of the enumerated constraints, the resource is rejected. If the resource does not set an explicit value, and if the constraint supports a default value, the default value is applied to the resource.
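For context, a `LimitRange` object of the kind this abstract describes might look like the following sketch. The name and all values are illustrative, not taken from the PR:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: resource-limits        # hypothetical name
spec:
  limits:
  - type: Pod
    max:
      cpu: "2"
      memory: 1Gi
    min:
      cpu: 200m
      memory: 6Mi
  - type: Container
    default:                   # applied when a container sets no explicit limit
      cpu: 300m
      memory: 200Mi
    defaultRequest:            # applied when a container sets no explicit request
      cpu: 200m
      memory: 100Mi
  - type: openshift.io/Image
    max:
      storage: 1Gi             # maximum image size pushable to the registry
  - type: PersistentVolumeClaim
    max:
      storage: 2Gi
```

Create it in a project with `oc create -f <file> -n <project>`; subsequent pod, image, and PVC requests in that project are then validated against these constraints.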
@@ -7,7 +7,7 @@
 = Admin quota usage

 [role="_abstract"]
-To ensure projects remain within defined constraints, monitor admin quota usage. By tracking the aggregate consumption of compute resources and storage, you can identify when `ResourceQuota` limits are reached or approached.
+To ensure projects remain within defined constraints, monitor admin quota usage. After a resource quota for a project is first created, the project restricts the ability to create any new resources that can violate a quota constraint until it has calculated updated usage statistics.

 Quota enforcement::
 +
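As a point of reference, a minimal `ResourceQuota` that this kind of usage tracking applies to could look like the following sketch (the name and the hard limits are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources      # hypothetical name
spec:
  hard:
    pods: "10"                 # total number of pods in the project
    requests.cpu: "4"          # sum of CPU requests across all pods
    requests.memory: 8Gi
    limits.cpu: "8"            # sum of CPU limits across all pods
    limits.memory: 16Gi
```

Administrators can then compare used against hard values with `oc describe quota compute-resources -n <project>`.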
@@ -7,7 +7,7 @@
 = Configure guest caching for disk

 [role="_abstract"]
-To ensure that the guest manages caching instead of the host, configure your disk devices. This setting shifts caching responsibility to the guest operating system, preventing the host from caching disk operations.
+To ensure that the guest manages caching instead of the host, configure your disk devices.

 Ensure that the driver element of the disk device includes the `cache="none"` and `io="native"` parameters.
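In a libvirt domain definition, the driver element described above sits inside the disk device. A sketch, with a hypothetical image path and target device:

```xml
<disk type="file" device="disk">
  <!-- cache="none" bypasses host page-cache; io="native" uses native AIO -->
  <driver name="qemu" type="qcow2" cache="none" io="native"/>
  <source file="/var/lib/libvirt/images/guest.qcow2"/>
  <target dev="vda" bus="virtio"/>
</disk>
```

The XML can be edited with `virsh edit <domain>`; the guest must be restarted for the change to take effect.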
@@ -8,7 +8,7 @@
 = Configuring huge pages at boot time

 [role="_abstract"]
-To ensure nodes in your {product-title} cluster pre-allocate memory for specific workloads, reserve huge pages at boot time. This configuration sets aside memory resources during system startup, offering a distinct alternative to run-time allocation.
+To ensure nodes in your {product-title} cluster pre-allocate memory for specific workloads, reserve huge pages at boot time.

 There are two ways of reserving huge pages: at boot time and at run time. Reserving at boot time increases the possibility of success because the memory has not yet been significantly fragmented.  The Node Tuning Operator currently supports boot-time allocation of huge pages on specific nodes.
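One way to reserve huge pages at boot is through kernel arguments in a `MachineConfig`; this is a sketch of that approach, not the module's prescribed procedure (the module uses the Node Tuning Operator), and the name and page counts are illustrative:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-hugepages                      # hypothetical name
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  kernelArguments:
    - hugepagesz=2M     # page size to pre-allocate
    - hugepages=512     # number of pages reserved at boot (1 GiB total here)
```

Applying this triggers a rolling reboot of the targeted nodes, after which each node reports the reserved capacity as `hugepages-2Mi`.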
@@ -7,7 +7,7 @@
 = Configuring quota synchronization period

 [role="_abstract"]
-To control the synchronization time frame when resources are deleted, configure the `resource-quota-sync-period` setting. This parameter in the `/etc/origin/master/master-config.yaml` file determines how frequently the system updates usage statistics to reflect deleted resources.
+When a set of resources are deleted, the synchronization time frame of resources is determined by the `resource-quota-sync-period` setting in the `/etc/origin/master/master-config.yaml` file. You can change the `resource-quota-sync-period` setting to have the set of resources regenerate in the needed amount of time (in seconds) for the resources to be once again available.

 [NOTE]
 ====
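A `master-config.yaml` fragment for this setting might look roughly like the following. This is an assumption about the file layout (the controller-arguments stanza and the `10s` value are illustrative, not from the PR):

```yaml
kubernetesMasterConfig:
  controllerArguments:
    resource-quota-sync-period:   # assumed location; verify against your master-config.yaml
      - "10s"                     # illustrative interval for recalculating quota usage
```

A shorter period makes deleted resources count against the quota again sooner, at the cost of more frequent recalculation.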
@@ -9,7 +9,7 @@
 = Consuming huge pages resources using the Downward API

 [role="_abstract"]
-To inject information about the huge pages resources consumed by a container, use the Downward API. This configuration enables applications to retrieve and use their own memory usage data directly.
+To inject information about the huge pages resources consumed by a container, use the Downward API.

 You can inject the resource allocation as environment variables, a volume plugin, or both. Applications that you develop and run in the container can determine the resources that are available by reading the environment variables or files in the specified volumes.
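The environment-variable form of this injection can be sketched as follows; the pod name and image are hypothetical, and exposing `requests.hugepages-<size>` through `resourceFieldRef` assumes a Kubernetes version where the DownwardAPIHugePages feature is available:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hugepages-downward-example           # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest   # hypothetical image
    resources:
      limits:
        hugepages-2Mi: 100Mi
        memory: 100Mi
        cpu: "1"
    env:
    - name: REQUESTS_HUGEPAGES_2MI           # the app reads its allocation from this variable
      valueFrom:
        resourceFieldRef:
          containerName: app
          resource: requests.hugepages-2Mi
    volumeMounts:
    - name: hugepage
      mountPath: /dev/hugepages
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages                      # backs the volume with pre-allocated huge pages
```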
@@ -7,7 +7,7 @@
 = Performance profiles and workload partitioning

 [role="_abstract"]
-To enable workload partitioning, apply a performance profile. This configuration specifies the isolated and reserved CPUs, ensuring that customer workloads run on dedicated cores without interruption from platform processes.
+To enable workload partitioning, apply a performance profile.

 An appropriately configured performance profile specifies the `isolated` and `reserved` CPUs. Create a performance profile by using the Performance Profile Creator (PPC) tool.
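The `isolated` and `reserved` CPU sets mentioned above appear in the profile like this. The profile name and CPU ranges are illustrative; in practice the Performance Profile Creator (PPC) tool generates these values for your hardware:

```yaml
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: openshift-node-performance     # hypothetical name
spec:
  cpu:
    reserved: "0-3"      # CPUs set aside for platform and management pods
    isolated: "4-47"     # CPUs dedicated to customer workloads
  nodeSelector:
    node-role.kubernetes.io/worker: ""
```

The two sets must not overlap and together should cover all CPUs on the targeted nodes.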
@@ -7,7 +7,7 @@
 = Disabling the cpuset cgroup controller

 [role="_abstract"]
-To allow the kernel scheduler to freely distribute processes across all available resources, disable the `cpuset` cgroup controller. This configuration prevents the system from enforcing processor affinity constraints, ensuring that tasks can use any available CPU or memory node.
+You can disable the cpuset cgroup controller. Disabling the controller requires a restart of the libvirtd daemon.

 [NOTE]
 ====
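For a libvirt host, one way to do this is to omit `cpuset` from the controller list in `qemu.conf`; a sketch, assuming the stock file location:

```ini
# /etc/libvirt/qemu.conf
# List the cgroup controllers libvirt may use; "cpuset" is deliberately omitted
# so the kernel scheduler can place tasks on any CPU or memory node.
cgroup_controllers = [ "cpu", "devices", "memory", "blkio", "cpuacct" ]
```

As the module notes, the change takes effect only after restarting the daemon, for example with `systemctl restart libvirtd`.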
@@ -7,7 +7,7 @@
 = Enabling workload partitioning

 [role="_abstract"]
-To partition cluster management pods into a specified CPU affinity, enable workload partitioning. This configuration ensures that management pods operate within the reserved CPU limits defined in your Performance Profile, preventing them from consuming resources intended for customer workloads.
+To partition cluster management pods into a specified CPU affinity, enable workload partitioning. This configuration ensures that management pods operate within the reserved CPU limits defined in your Performance Profile.

 Consider additional post-installation Operators that use workload partitioning when calculating how many reserved CPU cores to set aside for the platform.
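Workload partitioning is typically switched on at install time. A minimal `install-config.yaml` sketch (domain and replica count are hypothetical; only the `cpuPartitioningMode` field is the point here):

```yaml
apiVersion: v1
baseDomain: example.com          # hypothetical domain
cpuPartitioningMode: AllNodes    # enables workload partitioning cluster-wide at install time
compute:
- name: worker
  replicas: 3
```

After installation, the reserved CPU set that management pods are pinned to comes from the applied Performance Profile.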
@@ -8,7 +8,7 @@
 = How huge pages are consumed by apps

 [role="_abstract"]
-To enable applications to consume huge pages, nodes must pre-allocate these memory segments to report capacity. Because a node can only pre-allocate huge pages for a single size, you must align this configuration with your specific workload requirements.
+You must ensure that nodes pre-allocate huge pages in order for the node to report its huge page capacity. A node can only pre-allocate huge pages for a single size.

 Huge pages can be consumed through container-level resource requirements by using the resource name `hugepages-<size>`, where size is the most compact binary notation by using integer values supported on a particular node. For example, if a node supports 2048 KiB page sizes, the node exposes a schedulable resource `hugepages-2Mi`. Unlike CPU or memory, huge pages do not support over-commitment.
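A container-level request for the `hugepages-2Mi` resource named above can be sketched as follows (pod name and image are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hugepages-app                        # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest   # hypothetical image
    resources:
      requests:
        hugepages-2Mi: 100Mi   # huge pages are not over-committable,
      limits:                  # so request and limit must match
        hugepages-2Mi: 100Mi
        memory: 100Mi
```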
@@ -7,7 +7,7 @@
 = Boosting networking performance with RFS

 [role="_abstract"]
-To boost networking performance, activate Receive Flow Steering (RFS) by using the Machine Config Operator (MCO). This configuration improves packet processing efficiency by directing network traffic to specific CPUs.
+To boost networking performance, activate Receive Flow Steering (RFS) by using the Machine Config Operator (MCO). This configuration improves packet processing efficiency.

 RFS extends Receive Packet Steering (RPS) by further reducing network latency. RFS is technically based on RPS, and improves the efficiency of packet processing by increasing the CPU cache hit rate. RFS achieves this, while considering queue length, by determining the most convenient CPU for computation so that cache hits are more likely to occur within the CPU. This means that the CPU cache is invalidated less and requires fewer cycles to rebuild the cache, which reduces packet processing run time. | ||
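The MCO-based activation amounts to shipping a sysctl file to the nodes. A sketch, with a hypothetical object name and an illustrative flow-table size:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 50-enable-rfs                    # hypothetical name
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
      - path: /etc/sysctl.d/netfilter-rfs.conf
        mode: 0644
        contents:
          # decodes to: net.core.rps_sock_flow_entries=8192 (global RFS flow table size)
          source: data:,net.core.rps_sock_flow_entries%3D8192
```

Per-queue `rps_flow_cnt` values still need to be set for the relevant network devices for RFS to take effect.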
@@ -7,7 +7,7 @@
 = Choose your networking setup

 [role="_abstract"]
-To optimize performance for specific workloads and traffic patterns, select a networking setup based on your chosen hypervisor. This configuration ensures the networking stack meets the operational requirements of {product-title} clusters on IBM Z infrastructure.
+For {ibm-z-name} setups, the networking setup depends on the hypervisor of your choice. Depending on the workload and the application, the best fit usually changes with the use case and the traffic pattern.

 The networking stack is one of the most important components for a Kubernetes-based product like {product-title}.
@@ -7,6 +7,8 @@
 = {op-system-base} KVM on {ibm-z-title} host recommendations

 [role="_abstract"]
-To optimize Kernel-based Virtual Machine (KVM) performance on {ibm-z-title}, apply host recommendations. Because optimal settings depend strongly on specific workloads and available resources, finding the best balance for your {op-system-base} environment often requires experimentation to avoid adverse effects.
+To optimize Kernel-based Virtual Machine (KVM) performance on {ibm-z-title}, apply host recommendations.

 Optimizing a KVM virtual server environment strongly depends on the workloads of the virtual servers and on the available resources. The same action that enhances performance in one environment can have adverse effects in another. Finding the best balance for a particular setting can be a challenge and often involves experimentation.

 The following sections introduces some best practices when using {product-title} with {op-system-base} KVM on {ibm-z-name} and {ibm-linuxone-name} environments.
@@ -7,7 +7,7 @@
 = Maintaining bare metal hosts

 [role="_abstract"]
-To ensure your cluster inventory accurately reflects your physical infrastructure, maintain the details of the bare-metal host configurations by using the {product-title} web console.
+You can maintain the details of the bare metal hosts in your cluster from the {product-title} web console.

 .Procedure

Review comment on the added line: 🤖 [error] RedHat.TermsErrors: Use 'bare-metal hosts' rather than 'bare metal hosts'. For more information, see RedHat.TermsErrors.
@@ -7,7 +7,7 @@
 = Tuning the CPU migration algorithm of the host scheduler

 [role="_abstract"]
-To optimize task distribution and reduce latency, tune the CPU migration algorithm of the host scheduler. With this configuration, you can adjust how the kernel balances processes across available CPUs, ensuring efficient resource usage for your specific workloads.
+You can tune the CPU migration algorithm of the host scheduler to meet the demands of your production system.

 [IMPORTANT]
 ====
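One commonly tuned knob in this area is the scheduler's migration cost. A sketch of a persistent sysctl setting, assuming a kernel that still exposes this tunable through sysctl (newer kernels may expose it elsewhere or not at all), with an illustrative value:

```ini
# /etc/sysctl.d/90-sched.conf
# Minimum run time (ns) before the scheduler considers a task "cache-hot"
# and avoids migrating it to another CPU; value is illustrative.
kernel.sched_migration_cost_ns = 50000
```

Apply with `sysctl --system` and verify the effect against your actual workload before keeping the change.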
@@ -7,7 +7,7 @@ include::_attributes/common-attributes.adoc[]
 toc::[]

 [role="_abstract"]
-To prevent platform processes from interrupting your applications, configure workload partitioning. This isolates {product-title} services and infrastructure pods to a reserved set of CPUs, ensuring that the remaining compute resources are available exclusively for your customer workloads.
+Workload partitioning separates compute node CPU resources into distinct CPU sets. Ensure that you keep platform pods on the specified cores to avoid interrupting the CPUs the customer workloads are running on.

 The minimum number of reserved CPUs required for the cluster management is four CPU Hyper-Threads (HTs).
@@ -7,7 +7,7 @@ include::_attributes/common-attributes.adoc[]
 toc::[]

 [role="_abstract"]
-To optimize performance on mainframe infrastructure, apply host practices, so that you can configure {ibm-z-title} and {ibm-linuxone-name} environments to ensure your s390x architecture meets specific operational requirements
+You can apply host practices for {ibm-z-title} and {ibm-linuxone-name} environments to ensure your s390x architecture meets specific operational requirements.

 The s390x architecture is unique in many aspects. Some host practice recommendations might not apply to other platforms.
@@ -7,7 +7,7 @@ include::_attributes/common-attributes.adoc[]
 toc::[]

 [role="_abstract"]
-To optimize memory management for specific workloads, configure huge pages. By using these Linux-based system page sizes, you can maintain manual control over memory allocation and override automatic system behaviors.
+To optimize memory management for specific workloads, configure huge pages.

 include::modules/what-huge-pages-do.adoc[leveloffset=+1]
Original CQA https://github.com/openshift/openshift-docs/pull/106034/changes#diff-1ff84d4f4abe584b3ef88483554a5c297075e2277b293aa2507cc0c753de2447L6