@kazamatzuri kazamatzuri commented Dec 26, 2025

Summary

This PR adds support for variable message size distributions in the OpenMessaging Benchmark framework, enabling more realistic workload testing that mirrors production message size patterns.

It also bumps Maven and Lombok.

Motivation

Real-world messaging workloads rarely have uniform message sizes. Production systems typically see a distribution of message sizes - many small messages, fewer medium-sized ones, and occasional large payloads. The existing benchmark framework only supports fixed message sizes, which can lead to misleading benchmark results that do not reflect actual production behavior.

Changes

New Feature: messageSizeDistribution Configuration

Instead of specifying a single messageSize, users can now define a weighted distribution of message sizes in their workload YAML:

messageSizeDistribution:
  "0-256": 234        # 234 weight for messages 0-256 bytes
  "256-1024": 456     # 456 weight for messages 256B-1KB
  "1024-4096": 678    # 678 weight for messages 1-4KB
  "4096-16384": 312   # 312 weight for messages 4-16KB
  "16384-65536": 98   # 98 weight for messages 16-64KB
  "65536-262144": 45  # 45 weight for messages 64-256KB
  "262144-1048576": 18  # 18 weight for messages 256KB-1MB
  "1048576-5242880": 6  # 6 weight for messages 1-5MB
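Each key above is a byte range, and the implementation details in this PR say that a range is represented by its midpoint. As a rough illustration of how such a key could be interpreted, here is a minimal sketch. The class `SizeRangeSketch` and its methods are hypothetical and are not the PR's `MessageSizeDistribution` code; the exact suffix syntax (for example `"1KB-4KB"`) and the 1 KB = 1024 bytes convention are assumptions.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not the PR's MessageSizeDistribution class. */
final class SizeRangeSketch {

    /** Parses "512", "512B", "4KB" or "1MB" into bytes (assumes 1 KB = 1024 bytes). */
    static long parseSize(String token) {
        String t = token.trim().toUpperCase(Locale.ROOT);
        long multiplier = 1L;
        if (t.endsWith("KB")) {
            multiplier = 1024L;
            t = t.substring(0, t.length() - 2);
        } else if (t.endsWith("MB")) {
            multiplier = 1024L * 1024L;
            t = t.substring(0, t.length() - 2);
        } else if (t.endsWith("B")) {
            t = t.substring(0, t.length() - 1);
        }
        // Long.parseLong throws NumberFormatException on malformed input.
        return Long.parseLong(t.trim()) * multiplier;
    }

    /** Returns the midpoint, in bytes, of a "low-high" range key. */
    static long midpoint(String rangeKey) {
        int dash = rangeKey.indexOf('-');
        if (dash <= 0 || dash == rangeKey.length() - 1) {
            throw new IllegalArgumentException("Expected \"low-high\", got: " + rangeKey);
        }
        long low = parseSize(rangeKey.substring(0, dash));
        long high = parseSize(rangeKey.substring(dash + 1));
        if (low > high) {
            throw new IllegalArgumentException("Range start exceeds range end: " + rangeKey);
        }
        // Written as low + (high - low) / 2 to avoid overflow on large bounds.
        return low + (high - low) / 2;
    }

    public static void main(String[] args) {
        System.out.println(midpoint("0-256"));     // prints 128
        System.out.println(midpoint("1024-4096")); // prints 2560
        System.out.println(midpoint("1KB-4KB"));   // prints 2560
    }
}
```

Under these assumptions the `"1024-4096"` bucket would produce 2560-byte payloads.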

Key Implementation Details

  1. MessageSizeDistribution class (benchmark-framework/src/main/java/.../utils/payload/MessageSizeDistribution.java)

    • Parses range-based configuration (supports B, KB, MB suffixes)
    • Computes midpoint of each range as the representative payload size
    • Provides cumulative weights for O(log n) weighted random selection
  2. Workload class updates

    • Added messageSizeDistribution field (Map<String, Integer>)
    • Added usesDistribution() helper method
    • messageSizeDistribution and messageSize are treated as mutually exclusive: if a distribution is set, messageSize is ignored
  3. WorkloadGenerator updates

    • Creates one payload per bucket when distribution mode is enabled
    • Passes weights to workers for runtime selection
    • Uses weighted average size for backlog calculations and result reporting
  4. LocalWorker updates

    • Implements weighted payload selection using binary search on cumulative weights
    • Falls back to uniform random selection when weights are not provided (backward compatible)
  5. ProducerWorkAssignment updates

    • Added payloadWeights field to transmit distribution weights to workers
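The weighted selection described for LocalWorker (binary search over cumulative weights) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `WeightedPickerSketch`, `indexFor`, and `pick` are hypothetical names, and the sketch requires strictly positive weights so that the cumulative array is strictly increasing.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical illustration of O(log n) weighted selection; not the PR's LocalWorker code. */
final class WeightedPickerSketch {
    private final long[] cumulative; // cumulative[i] = weights[0] + ... + weights[i]
    private final long total;

    WeightedPickerSketch(int[] weights) {
        if (weights == null || weights.length == 0) {
            throw new IllegalArgumentException("weights must not be null or empty");
        }
        cumulative = new long[weights.length];
        long sum = 0;
        for (int i = 0; i < weights.length; i++) {
            // Assumption of this sketch: every weight is strictly positive.
            if (weights[i] <= 0) {
                throw new IllegalArgumentException("weights must be positive");
            }
            sum += weights[i];
            cumulative[i] = sum;
        }
        total = sum;
    }

    /** Maps r in [0, total) to the first bucket whose cumulative weight exceeds r. */
    int indexFor(long r) {
        int pos = Arrays.binarySearch(cumulative, r);
        // Exact hit: r equals cumulative[pos], so it belongs to the next bucket.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the insertion
        // point is the first index whose cumulative weight is greater than r.
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    /** Picks a bucket index with probability proportional to its weight. */
    int pick(Random rng) {
        long r = (long) (rng.nextDouble() * total); // uniform in [0, total)
        return indexFor(r);
    }

    public static void main(String[] args) {
        WeightedPickerSketch picker = new WeightedPickerSketch(new int[] {234, 456, 678});
        int[] counts = new int[3];
        Random rng = new Random(42);
        for (int i = 0; i < 136_800; i++) {
            counts[picker.pick(rng)]++;
        }
        // Counts should be roughly proportional to 234 : 456 : 678.
        System.out.println(Arrays.toString(counts));
    }
}
```

Building the cumulative array once keeps each per-message selection at one random draw plus a binary search, which matters on a hot producer path.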

Backward Compatibility

  • Existing workloads with messageSize continue to work unchanged
  • The weighted selection logic only activates when payloadWeights is provided
  • All existing tests pass

Testing

Added comprehensive unit tests in MessageSizeDistributionTest.java:

  • Basic range parsing
  • Size suffix parsing (KB, MB)
  • Midpoint calculations
  • Weight arrays and cumulative weights
  • Average and max size calculations
  • Production-like distribution handling
  • Error cases (null, empty, invalid formats, negative weights)
  • Weighted selection distribution verification

Files Changed

File                               Description
Workload.java                      Added messageSizeDistribution field and usesDistribution() method
WorkloadGenerator.java             Distribution-aware payload generation and backlog calculations
MessageSizeDistribution.java       New class for parsing and representing distributions
LocalWorker.java                   Weighted payload selection with binary search
ProducerWorkAssignment.java        Added payloadWeights field
MessageSizeDistributionTest.java   Comprehensive unit tests
pom.xml                            Updated to compile with Java 21

Example Usage

name: production-like-workload
topics: 1
partitionsPerTopic: 16
messageSizeDistribution:
  "0-256": 234
  "256-1024": 456
  "1024-4096": 678
producersPerTopic: 4
consumerPerSubscription: 4
producerRate: 10000
testDurationMinutes: 10
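For this example workload, the weighted average size that WorkloadGenerator is described as using for backlog calculations and reporting can be worked out by hand. The sketch below assumes each bucket is represented by its midpoint (128, 640, and 2560 bytes for the three ranges above); `WeightedAverageSketch` is a hypothetical name and not a class from this PR.

```java
/** Hypothetical illustration of the weighted average payload size; not the PR's code. */
final class WeightedAverageSketch {

    /** Returns sum(size[i] * weight[i]) / sum(weight[i]). */
    static double weightedAverage(long[] sizes, int[] weights) {
        if (sizes.length == 0 || sizes.length != weights.length) {
            throw new IllegalArgumentException("sizes and weights must be non-empty and equal in length");
        }
        double weightedSum = 0.0;
        long totalWeight = 0;
        for (int i = 0; i < sizes.length; i++) {
            weightedSum += (double) sizes[i] * weights[i];
            totalWeight += weights[i];
        }
        if (totalWeight <= 0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        return weightedSum / totalWeight;
    }

    public static void main(String[] args) {
        // Midpoints of "0-256", "256-1024" and "1024-4096" (assumed representative sizes).
        long[] midpoints = {128, 640, 2560};
        int[] weights = {234, 456, 678};
        // (128*234 + 640*456 + 2560*678) / (234 + 456 + 678) = 2057472 / 1368
        System.out.println(weightedAverage(midpoints, weights)); // prints 1504.0
    }
}
```

So under the midpoint assumption, this workload averages 1504 bytes per message.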
