
Feat: Add GKE Redis Benchmarking Support (Standalone & HA w/ HAProxy)#6462

Draft
manojcns wants to merge 6 commits into GoogleCloudPlatform:master from manojcns:redis-gke
manojcns commented Feb 10, 2026

This PR introduces Optimized and High Availability (HA) Redis benchmarking capabilities for Google Kubernetes Engine (GKE), significantly extending the existing baseline GKE Redis support in PKB.

Summary

While PKB already supports baseline Redis on Kubernetes, this contribution adds specialized benchmarks for Optimized GKE workloads and HA topologies. These new benchmarks leverage GKE-specific network features (like Host Networking) and robust architectures (HAProxy with Primary-Replica) to fully test the performance limits of Google Cloud infrastructure (C4, N2 machine types).

Key Features

  • New Optimized Benchmark (gke_optimized_redis_memtier_v2):
    • Host Networking: Bypasses the CNI overlay to bind directly to the node's network interface for reduced latency.
    • Advanced Tuning: Dynamically manages IO threads and disables RDB snapshots to isolate memory/network performance.
    • GKE Specifics: Leverages GKE Node Auto Provisioning and machine-type-specific optimizations.
  • New HA Benchmark (gke_redis_ha_haproxy):
    • Regional Topology: Deploys a full Primary-Replica architecture across zones.
    • HAProxy Integration: Includes a dedicated HAProxy layer for read/write splitting and failover scenario testing.
  • Durable Storage Testing:
    • AOF Persistence: New scenarios to support AOF-enabled workloads (append-only file) for data durability testing.

Ecosystem Support

  • Versions: Verified support for Redis 6, 7, 8 and Valkey 8.
  • Workload Ratios: Validated for memtier SET:GET ratios of 1:1 (Write-heavy) and 1:4 (Read-heavy) caching scenarios.
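
For reference, memtier's ratio is expressed as SET:GET, so the share of reads in each scenario can be worked out directly. A small sketch (the helper name is hypothetical):

```python
def read_fraction(ratio):
    """Return the fraction of GET requests for a memtier SET:GET ratio string."""
    sets, gets = (int(part) for part in ratio.split(':'))
    return gets / (sets + gets)


for ratio in ('1:1', '1:4'):
    print(ratio, f'{read_fraction(ratio):.0%} reads')
```

Under this reading, 1:1 issues half reads and half writes, while 1:4 issues four reads for every write.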

Code Structure & Implementation Details

-- New Files

  • [perfkitbenchmarker/linux_benchmarks/gke_optimized_redis_memtier_v2_benchmark.py]: Core logic for the Standalone Optimized benchmark. Implements GKE-specific tuning like dynamic IO thread allocation and Host Networking injection.
  • [perfkitbenchmarker/linux_benchmarks/gke_redis_ha_haproxy_benchmark.py]: Core logic for the High Availability (HA) benchmark. Orchestrates Primary, Replica, and HAProxy nodes and coordinates failover/read-write split testing.
  • [perfkitbenchmarker/linux_packages/haproxy.py]: New package to install, configure, and start HAProxy services for the HA benchmark.
  • [perfkitbenchmarker/container_service.py]: Helper functions for container management (required by Kubernetes VM logic).
  • [docs/GKE_Redis_Quickstart_generic.md]: Comprehensive user guide with example run commands.
  • [docs/Technical_Architecture_Redis_PKB.md]: Deep-dive technical document explaining GKE optimizations (Host Networking, IO threading) and architecture logic.

-- Modified Files

  • [perfkitbenchmarker/configs/default_benchmark_config.yaml]: Added gke_optimized_redis_memtier configuration to register the new benchmark.
  • [perfkitbenchmarker/linux_benchmarks/redis_memtier_benchmark.py]: Updated StartServices to respect redis_server_io_threads flag, allowing dynamic assignment of IO threads for optimized runs.
  • [perfkitbenchmarker/linux_packages/redis_server.py]:
    • Added support for REDIS_8_0 and updated installation logic to handle newer build dependencies.
    • Refactored Start and Stop commands to use configurable binary names (supporting valkey-server alongside redis-server).
    • Improved shutdown logic with retries to prevent race conditions during teardown.
    • Added ConfigureReplication method to support Primary-Replica pairing.
  • [perfkitbenchmarker/resources/kubernetes/kubernetes_virtual_machine.py]:
    • Enhanced _IsKubectlErrorEphemeral to treat "deadline exceeded" errors as retriable, improving stability on GKE.
    • Added a small retry sleep (3s) for ephemeral kubectl connection issues.
  • [perfkitbenchmarker/managed_memory_store.py]:
    • Added REDIS_8_0 to the supported versions list.
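
The kubectl retry behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in kubernetes_virtual_machine.py: the marker list, function names, and attempt limit are hypothetical, and only the "deadline exceeded" match and the 3-second sleep come from the description.

```python
import time

# Hypothetical marker list; the PR only mentions "deadline exceeded".
EPHEMERAL_MARKERS = ('deadline exceeded',)


def is_ephemeral(stderr):
    """Return True if the error text looks like a transient kubectl failure."""
    text = stderr.lower()
    return any(marker in text for marker in EPHEMERAL_MARKERS)


def run_with_retry(run, max_attempts=5, sleep_seconds=3, sleep=time.sleep):
    """Call run() until it succeeds or fails with a non-ephemeral error.

    run is a callable returning (stdout, stderr, returncode).
    """
    for attempt in range(1, max_attempts + 1):
        stdout, stderr, code = run()
        if code == 0:
            return stdout
        if attempt == max_attempts or not is_ephemeral(stderr):
            raise RuntimeError(f'command failed after {attempt} attempt(s): {stderr}')
        # Short pause before retrying a transient failure.
        sleep(sleep_seconds)
```

Passing `sleep` as a parameter keeps the helper easy to unit test without real delays.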

Example Run Command:

python3 pkb.py \
  --benchmarks=gke_optimized_redis_memtier_v2 \
  --cloud=GCP \
  --vm_platform=Kubernetes \
  --zone=us-east1-b \
  --project=$PROJECT_ID \
  --os_type=ubuntu2404 \
  --gke_release_channel=rapid \
  --gke_max_cpu=1000 \
  --gke_max_memory=4000 \
  --gke_redis_v2_machine_type=c4-standard-4 \
  --gke_redis_v2_server_machine_type=c4-standard-4 \
  --gke_redis_v2_client_machine_type=c4-standard-32 \
  --gke_redis_v2_enable_optimization=True \
  --config_override=gke_optimized_redis_memtier_v2.container_cluster.nodepools.servers.vm_spec.GCP.boot_disk_type=hyperdisk-balanced \
  --config_override=gke_optimized_redis_memtier_v2.container_cluster.nodepools.clients.vm_spec.GCP.boot_disk_type=hyperdisk-balanced \
  --config_override=gke_optimized_redis_memtier_v2.vm_groups.servers.vm_spec.Kubernetes.host_network=True \
  --memtier_key_pattern=R:R \
  --memtier_distinct_client_seed=True \
  --memtier_key_maximum=6400000 \
  --redis_server_enable_snapshots=False \
  --redis_server_version=7.2.6 \
  --redis_git_repo=https://github.com/redis/redis.git \
  --redis_type=redis \
  --redis_eviction_policy=allkeys-lru \
  --iostat=True \
  --sar=False \
  --memtier_data_size=1024 \
  --memtier_ratio=1:4 \
  --memtier_threads=32 \
  --memtier_clients=12 \
  --memtier_run_duration=300 \
  --memtier_run_count=1 \
  --memtier_pipeline=1 \
  --redis_aof=False \
  --create_and_boot_post_task_delay=180 \
  --temp_dir=./pkb_temp \
  --owner=$(whoami | tr '.' '-') \
  --log_level=error \
  --accept_licenses
  

google-cla bot commented Feb 10, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.
