1 change: 1 addition & 0 deletions modules/cnf-measuring-latency.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
[id="cnf-measuring-latency_{context}"]
= Measuring latency

[role="_abstract"]
The `cnf-tests` image uses three tools to measure the latency of the system:

* `hwlatdetect`
Expand Down
46 changes: 14 additions & 32 deletions modules/cnf-performing-end-to-end-tests-disconnected-mode.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -6,19 +6,18 @@
[id="cnf-performing-end-to-end-tests-disconnected-mode_{context}"]
= Running latency tests in a disconnected cluster

[role="_abstract"]
The CNF tests image can run tests in a disconnected cluster that is not able to reach external registries. This requires two steps:

. Mirroring the `cnf-tests` image to the custom disconnected registry.

. Instructing the tests to consume the images from the custom disconnected registry.

[discrete]
[id="cnf-performing-end-to-end-tests-mirroring-images-to-custom-registry_{context}"]
== Mirroring the images to a custom registry accessible from the cluster

A `mirror` executable is shipped in the image to provide the input required by `oc` to mirror the test image to a local registry.
.Procedure

. Run this command from an intermediate machine that has access to the cluster and link:https://catalog.redhat.com/software/containers/explore[registry.redhat.io]:
. Mirror the images to a custom registry accessible from the cluster. A `mirror` executable is shipped in the image to provide the input required by `oc` to mirror the test image to a local registry.
+
Run this command from an intermediate machine that has access to the cluster and link:https://catalog.redhat.com/software/containers/explore[registry.redhat.io]:
+
[source,terminal,subs="attributes+"]
----
Expand All @@ -44,13 +43,9 @@ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \
<disconnected_registry>/cnf-tests-rhel9:v{product-version} /usr/bin/test-run.sh --ginkgo.v --ginkgo.timeout="24h"
----

[discrete]
[id="cnf-performing-end-to-end-tests-image-parameters_{context}"]
== Configuring the tests to consume images from a custom registry

You can run the latency tests using a custom test image and image registry using `CNF_TESTS_IMAGE` and `IMAGE_REGISTRY` variables.

* To configure the latency tests to use a custom test image and image registry, run the following command:
. Configure the tests to consume images from a custom registry. You can run the latency tests with a custom test image and image registry by setting the `CNF_TESTS_IMAGE` and `IMAGE_REGISTRY` variables.
+
To configure the latency tests to use a custom test image and image registry, run the following command:
+
[source,terminal,subs="attributes+"]
----
Expand All @@ -68,15 +63,9 @@ where:
<custom_cnf-tests_image> :: is the custom cnf-tests image, for example, `custom-cnf-tests-image:latest`.
--
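As a minimal sketch of how the two variables reach the test run, the following Python helper assembles a `podman run` argument list that passes `IMAGE_REGISTRY` and `CNF_TESTS_IMAGE` with `-e`, mirroring the commands shown in this module. The function name, the `runner_image` parameter, and the example registry host are assumptions made for illustration and are not part of the documented procedure.

```python
import shlex


def build_latency_test_command(runner_image, registry, test_image, kubeconfig_dir="."):
    """Return a podman argument list for a latency test run.

    runner_image is the image that podman starts (hypothetical parameter).
    registry and test_image are handed to the tests through the
    IMAGE_REGISTRY and CNF_TESTS_IMAGE environment variables.
    """
    return [
        "podman", "run",
        # Mount the directory that holds the kubeconfig file into the container.
        "-v", f"{kubeconfig_dir}/:/kubeconfig:Z",
        "-e", "KUBECONFIG=/kubeconfig/kubeconfig",
        # Point the tests at the custom registry and custom test image.
        "-e", f"IMAGE_REGISTRY={registry}",
        "-e", f"CNF_TESTS_IMAGE={test_image}",
        runner_image,
        "/usr/bin/test-run.sh", "--ginkgo.v", "--ginkgo.timeout=24h",
    ]


if __name__ == "__main__":
    # All values below are placeholders, not real endpoints.
    command = build_latency_test_command(
        runner_image="registry.example.com:5000/cnf-tests:latest",
        registry="registry.example.com:5000",
        test_image="custom-cnf-tests-image:latest",
    )
    print(shlex.join(command))
```

Building the command as a list rather than a single string avoids shell quoting problems when a registry or image name contains unusual characters.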

[discrete]
[id="cnf-performing-end-to-end-tests-mirroring-to-cluster-internal-registry_{context}"]
== Mirroring images to the cluster {product-registry}

{product-title} provides a built-in container image registry, which runs as a standard workload on the cluster.

.Procedure

. Gain external access to the registry by exposing it with a route:
. Mirror images to the cluster {product-registry}. {product-title} provides a built-in container image registry, which runs as a standard workload on the cluster.
+
Gain external access to the registry by exposing it with a route:
+
[source,terminal]
----
Expand Down Expand Up @@ -147,17 +136,10 @@ $ podman run -v $(pwd)/:/kubeconfig:Z -e KUBECONFIG=/kubeconfig/kubeconfig \
-e IMAGE_REGISTRY=image-registry.openshift-image-registry.svc:5000/cnftests cnf-tests-local:latest /usr/bin/test-run.sh --ginkgo.v --ginkgo.timeout="24h"
----

[discrete]
[id="mirroring-different-set-of-images_{context}"]
== Mirroring a different set of test images

You can optionally change the default upstream images that are mirrored for the latency tests.

.Procedure

. The `mirror` command tries to mirror the upstream images by default. This can be overridden by passing a file with the following format to the image:
. Optional: Mirror a different set of test images. You can change the default upstream images that are mirrored for the latency tests.
+
The `mirror` command tries to mirror the upstream images by default. You can override this behavior by passing a file with the following format to the image:
+

[source,yaml,subs="attributes+"]
----
[
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
[id="cnf-performing-end-to-end-tests-junit-test-output_{context}"]
= Generating a JUnit latency test report

[role="_abstract"]
Use the following procedures to generate a JUnit latency test output and test failure report.

.Prerequisites
Expand Down
11 changes: 6 additions & 5 deletions modules/cnf-performing-end-to-end-tests-running-cyclictest.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
[id="cnf-performing-end-to-end-tests-running-cyclictest_{context}"]
= Running cyclictest

[role="_abstract"]
The `cyclictest` tool measures the real-time kernel scheduler latency on the specified CPUs.

[NOTE]
Expand Down Expand Up @@ -61,13 +62,12 @@ FAIL! -- 0 Passed | 1 Failed | 0 Pending | 2 Skipped
FAIL
----

[discrete]
[id="cnf-performing-end-to-end-tests-example-results-cyclictest_{context}"]
== Example cyclictest results
.Verification

The same output can indicate different results for different workloads. For example, spikes up to 18μs are acceptable for 4G DU workloads, but not for 5G DU workloads.
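Because the acceptable spike size depends on the workload, one way to evaluate a run is to compare the largest per-thread maximum against a budget that the caller supplies. The following Python sketch assumes the `cyclictest` histogram output ends with a `# Max Latencies:` summary line; the function names and the sample values are hypothetical and only illustrate the comparison.

```python
def max_latencies_us(cyclictest_output):
    """Return the per-thread maximum latencies, in microseconds.

    Assumes the output contains a "# Max Latencies:" summary line,
    with one zero-padded value per measurement thread.
    """
    for line in cyclictest_output.splitlines():
        if line.startswith("# Max Latencies:"):
            return [int(value) for value in line.split(":", 1)[1].split()]
    return []


def within_budget(cyclictest_output, budget_us):
    """Return True if every thread stayed at or below budget_us.

    A run without a summary line is treated as not passing.
    """
    latencies = max_latencies_us(cyclictest_output)
    return bool(latencies) and max(latencies) <= budget_us


if __name__ == "__main__":
    # Hypothetical summary line for three threads.
    sample = "# Max Latencies: 00005 00018 00007"
    # 18 is the 4G DU figure from the text; 10 is an illustrative tighter budget.
    for budget in (18, 10):
        print(budget, within_budget(sample, budget))
```

The same output passes against a budget of 18μs and fails against a tighter, illustrative budget, which is why the budget is a parameter rather than a constant.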

.Example of good results
The following example shows good results:

[source,terminal]
----
running cmd: cyclictest -q -D 10m -p 1 -t 16 -a 2,4,6,8,10,12,14,16,54,56,58,60,62,64,66,68 -h 30 -i 1000 -m
Expand Down Expand Up @@ -100,7 +100,8 @@ More histogram entries ...
# Thread 15:
----

.Example of bad results
The following example shows bad results:

[source,terminal]
----
running cmd: cyclictest -q -D 10m -p 1 -t 16 -a 2,4,6,8,10,12,14,16,54,56,58,60,62,64,66,68 -h 30 -i 1000 -m
Expand Down
23 changes: 12 additions & 11 deletions modules/cnf-performing-end-to-end-tests-running-hwlatdetect.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,11 @@
//
// * scalability_and_performance/low_latency_tuning/cnf-performing-platform-verification-latency-tests.adoc

:_mod-docs-content-type: CONCEPT
:_mod-docs-content-type: PROCEDURE
[id="cnf-performing-end-to-end-tests-running-hwlatdetect_{context}"]
= Running hwlatdetect

[role="_abstract"]
The `hwlatdetect` tool is available in the `rt-kernel` package with a regular subscription of {op-system-base-full} {op-system-version}.

[NOTE]
Expand Down Expand Up @@ -62,21 +63,21 @@ Will run 1 of 3 specs
Running on machine: hwlatdetect-b6n4n
Binary: Built with gc go1.17.12 for linux/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0908 15:25:27.160620 1 node.go:39] Environment information: /proc/cmdline: BOOT_IMAGE=(hd1,gpt3)/ostree/rhcos-c6491e1eedf6c1f12ef7b95e14ee720bf48359750ac900b7863c625769ef5fb9/vmlinuz-4.18.0-372.19.1.el8_6.x86_64 random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ignition.platform.id=metal ostree=/ostree/boot.1/rhcos/c6491e1eedf6c1f12ef7b95e14ee720bf48359750ac900b7863c625769ef5fb9/0 ip=dhcp root=UUID=5f80c283-f6e6-4a27-9b47-a287157483b2 rw rootflags=prjquota boot=UUID=773bf59a-bafd-48fc-9a87-f62252d739d3 skew_tick=1 nohz=on rcu_nocbs=0-3 tuned.non_isolcpus=0000ffff,ffffffff,fffffff0 systemd.cpu_affinity=4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79 intel_iommu=on iommu=pt isolcpus=managed_irq,0-3 nohz_full=0-3 tsc=nowatchdog nosoftlockup nmi_watchdog=0 mce=off skew_tick=1 rcutree.kthread_prio=11 + +
I0908 15:25:27.160620 1 node.go:39] Environment information: /proc/cmdline: BOOT_IMAGE=(hd1,gpt3)/ostree/rhcos-c6491e1eedf6c1f12ef7b95e14ee720bf48359750ac900b7863c625769ef5fb9/vmlinuz-4.18.0-372.19.1.el8_6.x86_64 random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ignition.platform.id=metal ostree=/ostree/boot.1/rhcos/c6491e1eedf6c1f12ef7b95e14ee720bf48359750ac900b7863c625769ef5fb9/0 ip=dhcp root=UUID=5f80c283-f6e6-4a27-9b47-a287157483b2 rw rootflags=prjquota boot=UUID=773bf59a-bafd-48fc-9a87-f62252d739d3 skew_tick=1 nohz=on rcu_nocbs=0-3 tuned.non_isolcpus=0000ffff,ffffffff,fffffff0 systemd.cpu_affinity=4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79 intel_iommu=on iommu=pt isolcpus=managed_irq,0-3 nohz_full=0-3 tsc=nowatchdog nosoftlockup nmi_watchdog=0 mce=off skew_tick=1 rcutree.kthread_prio=11
I0908 15:25:27.160830 1 node.go:46] Environment information: kernel version 4.18.0-372.19.1.el8_6.x86_64
I0908 15:25:27.160857 1 main.go:50] running the hwlatdetect command with arguments [/usr/bin/hwlatdetect --threshold 1 --hardlimit 1 --duration 100 --window 10000000us --width 950000us]
F0908 15:27:10.603523 1 main.go:53] failed to run hwlatdetect command; out: hwlatdetect: test duration 100 seconds
detector: tracer
parameters:
Latency threshold: 1us <1>
Latency threshold: 1us
Sample window: 10000000us
Sample width: 950000us
Non-sampling period: 9050000us
Output File: None

Starting test
test finished
Max Latency: 326us <2>
Max Latency: 326us
Samples recorded: 5
Samples exceeding threshold: 5
ts: 1662650739.017274507, inner:6, outer:6
Expand All @@ -100,20 +101,19 @@ FAIL! -- 0 Passed | 1 Failed | 0 Pending | 2 Skipped
--- FAIL: TestTest (366.08s)
FAIL
----
<1> You can configure the latency threshold by using the `MAXIMUM_LATENCY` or the `HWLATDETECT_MAXIMUM_LATENCY` environment variables.
<2> The maximum latency value measured during the test.
* `Latency threshold`: You can configure the latency threshold by using the `MAXIMUM_LATENCY` or the `HWLATDETECT_MAXIMUM_LATENCY` environment variables.
* `Max Latency`: The maximum latency value measured during the test.

[discrete]
[id="cnf-performing-end-to-end-tests-example-results-hwlatdetect_{context}"]
== Example hwlatdetect test results
.Verification

You can capture the following types of results:

* Rough results that are gathered after each run to create a history of the impact of any changes made throughout the test.

* The combined set of the rough tests with the best results and configuration settings.

.Example of good results
The following example shows good results:

[source,terminal]
----
hwlatdetect: test duration 3600 seconds
Expand All @@ -133,7 +133,8 @@ Samples recorded: 0

The `hwlatdetect` tool only provides output if the sample exceeds the specified threshold.
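To compare runs over time, it can help to pull the summary fields out of the `hwlatdetect` output. The following Python sketch extracts the fields that appear in the example output in this module (`Latency threshold`, `Max Latency`, `Samples recorded`, and `Samples exceeding threshold`). The function names are hypothetical, and the sketch assumes the field labels match the example output exactly.

```python
import re

# Field labels as they appear in the example hwlatdetect output.
_FIELDS = {
    "threshold_us": r"Latency threshold:\s*(\d+)us",
    "max_latency_us": r"Max Latency:\s*(\d+)us",
    "samples_recorded": r"Samples recorded:\s*(\d+)",
    "samples_exceeding": r"Samples exceeding threshold:\s*(\d+)",
}


def parse_hwlatdetect(output):
    """Return the numeric summary fields found in the output.

    Fields that are absent or not numeric are left out of the result.
    """
    result = {}
    for name, pattern in _FIELDS.items():
        match = re.search(pattern, output)
        if match:
            result[name] = int(match.group(1))
    return result


def no_samples_exceeded(fields):
    """Return True only if the output reports zero samples over the threshold."""
    return fields.get("samples_exceeding") == 0


if __name__ == "__main__":
    sample = """\
Latency threshold: 1us
Max Latency: 326us
Samples recorded: 5
Samples exceeding threshold: 5
"""
    fields = parse_hwlatdetect(sample)
    print(fields, no_samples_exceeded(fields))
```

The check is deliberately conservative: if the `Samples exceeding threshold` field is missing, the run is not reported as clean.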

.Example of bad results
The following example shows bad results:

[source,terminal]
----
hwlatdetect: test duration 3600 seconds
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
[id="cnf-performing-end-to-end-tests-running-in-single-node-cluster_{context}"]
= Running latency tests on a {sno} cluster

[role="_abstract"]
You can run latency tests on {sno} clusters.

[NOTE]
Expand Down
6 changes: 4 additions & 2 deletions modules/cnf-performing-end-to-end-tests-running-oslat.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
[id="cnf-performing-end-to-end-tests-running-oslat_{context}"]
= Running oslat

[role="_abstract"]
The `oslat` test simulates a CPU-intensive DPDK application and measures all the interruptions and disruptions to test how the cluster handles CPU-heavy data processing.

[NOTE]
Expand Down Expand Up @@ -60,7 +61,7 @@ Will run 1 of 3 specs
should succeed [It]
/remote-source/app/vendor/github.com/openshift/cluster-node-tuning-operator/test/e2e/performanceprofile/functests/4_latency/latency.go:153

The current latency 304 is bigger than the expected one 1 : <1>
The current latency 304 is bigger than the expected one 1 :

[...]

Expand All @@ -74,4 +75,5 @@ FAIL! -- 0 Passed | 1 Failed | 0 Pending | 2 Skipped
--- FAIL: TestTest (161.42s)
FAIL
----
<1> In this example, the measured latency is outside the maximum allowed value.
+
In this example, the measured latency exceeds the maximum allowed value.
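When you collect results from many runs, the measured and expected values can be read from the failure message in the log. The following Python sketch assumes the message wording matches the example output (`The current latency 304 is bigger than the expected one 1`); the function name is hypothetical.

```python
import re

# Wording taken from the failure message in the example output.
_FAILURE = re.compile(
    r"The current latency (\d+) is bigger than the expected one (\d+)"
)


def parse_latency_failure(log_text):
    """Return (measured, expected) from the failure message, or None.

    None means the message was not found in the given log text.
    """
    match = _FAILURE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    line = "The current latency 304 is bigger than the expected one 1 :"
    print(parse_latency_failure(line))
```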
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
[id="cnf-performing-end-to-end-tests-running-the-tests_{context}"]
= Running the latency tests

[role="_abstract"]
Run the cluster latency tests to validate node tuning for your Cloud-native Network Functions (CNF) workload.

[NOTE]
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
[id="cnf-performing-end-to-end-tests-test-failure-report_{context}"]
= Generating a latency test failure report

[role="_abstract"]
Use the following procedures to generate a JUnit latency test output and test failure report.

.Prerequisites
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
[id="cnf-performing-end-to-end-tests-troubleshooting_{context}"]
= Troubleshooting errors with the cnf-tests container

[role="_abstract"]
To run latency tests, the cluster must be accessible from within the `cnf-tests` container.

.Prerequisites
Expand Down