
[SPARK-56543] Add RTM stateless benchmark#55420

Open
jerrypeng wants to merge 7 commits into apache:master from jerrypeng:SPARK-56543

Conversation

@jerrypeng
Contributor

@jerrypeng jerrypeng commented Apr 20, 2026

What changes were proposed in this pull request?

Adds RTMKafkaKafkaBenchmark, a standalone benchmark program for the Real-Time Mode (RTM) trigger in Structured Streaming. It is a stateless end-to-end Kafka-to-Kafka latency benchmark.

The benchmark is implemented as an object extending org.apache.spark.benchmark.BenchmarkBase, following the same convention as other Spark benchmarks (e.g.
StateStoreBasicOperationsBenchmark, MapStatusesSerDeserBenchmark). It is not a ScalaTest suite, so it is not discovered or executed by SBT test or Maven surefire — it only runs when
invoked explicitly via runMain or spark-submit.

The benchmark:

  1. Spins up a local-cluster Spark context (local-cluster[3, 5, 1024]) and a live embedded Kafka broker via KafkaTestUtils.
  2. Generates synthetic records at 1,000 records/sec into an input Kafka topic (5 partitions) from a background producer thread.
  3. Runs a stateless pipeline with RealTimeTrigger: reads from Kafka → base64-encodes the value → stamps a source-timestamp header → writes to an output Kafka topic.
  4. Captures per-batch processing latency via Spark's observe() API.
  5. After N batches complete, reads back the output topic and reports e2e latency percentiles (p0, p50, p90, p95, p99, p100) by comparing the source-timestamp header to the Kafka sink
    timestamp.
  6. Owns its own teardown via a try { ... } finally { cleanup() } inside runBenchmarkSuite, with an idempotent cleanup() that stops Spark and tears down the embedded Kafka broker even if
    setup partially fails, the streaming query times out, or post-run analysis throws.
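The stateless pipeline in steps 3–4 can be sketched roughly as follows. This is an illustrative reconstruction, not the PR's actual code: the topic names, bootstrap server, header key, and the `RealTimeTrigger` construction are all assumptions.

```scala
// Sketch of the Kafka-to-Kafka RTM pipeline (steps 3-4 above).
// Topic names, options, and trigger construction are illustrative assumptions.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark: SparkSession = ???  // local-cluster session created during setup

val input = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")  // embedded broker address
  .option("subscribe", "rtm-input")
  .load()

// Base64-encode the value and stamp a source-timestamp header so the
// post-run analysis can compare it against the Kafka sink timestamp.
val transformed = input
  .select(
    col("key"),
    base64(col("value")).cast("binary").as("value"),
    array(struct(
      lit("source-ts").as("key"),
      col("timestamp").cast("string").cast("binary").as("value"))).as("headers"))
  // observe() captures per-batch metrics via QueryProgress without a separate action.
  .observe("metrics", count(lit(1)).as("rows"))

val query = transformed.writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("topic", "rtm-output")
  .option("includeHeaders", "true")  // required so the header survives to the sink
  .trigger(??? /* RealTimeTrigger; exact constructor assumed */)
  .start()
```

The `includeHeaders` option is what lets the source timestamp ride along to the output topic, where step 5 reads it back for the latency computation.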

Sample benchmark results

 Kafka to kafka query e2e_latency in milliseconds is
  p0:   45
  p50:  70
  p90:  78
  p95:  81
  p99:  85
  p100: 331
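Percentiles like those above can be computed from the collected per-record latencies with a nearest-rank selection over the sorted values. A self-contained sketch (not the PR's actual aggregation logic):

```scala
// Nearest-rank percentile over e2e latencies in milliseconds.
// Illustrative only; the benchmark's real reporting code may differ.
def percentile(sorted: Seq[Long], p: Double): Long = {
  require(sorted.nonEmpty, "no latencies collected")
  val rank = math.ceil(p / 100.0 * sorted.length).toInt - 1
  sorted(rank.max(0).min(sorted.length - 1))
}

val latencies: Seq[Long] = Seq(70L, 45L, 331L, 78L, 85L, 81L).sorted
for (p <- Seq(0, 50, 90, 95, 99, 100)) {
  println(s"p$p: ${percentile(latencies, p)}")
}
```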

Why are the changes needed?

There is currently no benchmark to measure RTM stateless Kafka-to-Kafka latency. This makes it hard to quantify regressions or improvements to the RTM code path during local development
or before merging changes. This benchmark provides a repeatable, self-contained way to measure that, and follows the existing Spark benchmark framework so result files can be committed
and diffed across runs.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

N/A. Only a benchmark was added.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Sonnet 4.6 (claude-sonnet-4-6)

@jerrypeng
Contributor Author

@viirya thank you for your review! I have addressed your comments. PTAL.


@jerrypeng jerrypeng requested a review from viirya April 28, 2026 05:09
@jerrypeng
Contributor Author

@viirya I have addressed your comments. PTAL. Thanks in advance!

Member

@viirya viirya left a comment


The source-level run instructions are now consistent with Spark benchmark style: sql-kafka-0-10/Test/runMain ..., and output support via SPARK_GENERATE_BENCHMARK_FILES=1 is aligned with BenchmarkBase.

The PR description is stale: it still says RTMKafkaKafkaBenchmarkSuite and testOnly *RTMKafkaKafkaBenchmarkSuite. That should be updated to the new object name and Test/runMain command, but that is documentation cleanup rather than a code blocker.

private var spark: SparkSession = _
private var testUtils: KafkaTestUtils = _

override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
Member


BenchmarkBase.main calls runBenchmarkSuite(args) and only calls afterAll() afterwards; it does not wrap runBenchmarkSuite in try/finally. This benchmark starts embedded Kafka and a local-cluster Spark session in runBenchmarkSuite, then relies on afterAll() for teardown. If benchmark(...) times out, the query fails, getLatencies throws, or setup partially fails after Kafka starts, afterAll() will not run, leaving Kafka/Spark resources behind. Since this benchmark intentionally runs heavyweight local resources, it should handle its own exception path, e.g. wrap setup/run in try/finally or call an idempotent cleanup method on failure.
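The idempotent-cleanup pattern this review comment asks for looks roughly like the following. Method and field names (`setup()`, `runBenchmarks()`, the guarded `cleanup()`) are assumptions for illustration, not the PR's exact code:

```scala
// Idempotent teardown owned by the benchmark itself, since BenchmarkBase.main
// does not wrap runBenchmarkSuite in try/finally. Names are illustrative.
@volatile private var cleaned = false

private def cleanup(): Unit = synchronized {
  if (!cleaned) {
    cleaned = true
    // Each step is guarded so a null or failed resource doesn't block the rest.
    try { if (spark != null) spark.stop() }
    catch { case e: Exception => e.printStackTrace() }
    try { if (testUtils != null) testUtils.teardown() }
    catch { case e: Exception => e.printStackTrace() }
  }
}

override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
  try {
    setup()          // start embedded Kafka, then the local-cluster Spark session
    runBenchmarks()  // may time out, fail the query, or throw in analysis
  } finally {
    cleanup()        // runs even if setup partially failed
  }
}
```

Making `cleanup()` idempotent also keeps it safe to call from both the failure path and `afterAll()`.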

Contributor Author


Sure, will add. Though the resources will not really be leaked, as 1) Kafka runs in the same process and 2) workers will shut themselves down when the driver is not reachable.

Member

@viirya viirya left a comment


It looks almost good to merge except for one minor issue and PR description cleanup. After fixing that, we can merge this.

Member

@viirya viirya left a comment


Btw, you added a new benchmark but didn't add a benchmark result file?

Without a benchmark result file, how do we know whether later PRs make an improvement or cause a regression?

Could you run it and add the benchmark result too?

@jerrypeng
Contributor Author

@viirya added.

* Example output from a recent run (Linux x86_64, OpenJDK 17):
* {{{
* Kafka to kafka query e2e_latency in milliseconds is
* p0: 45
Member


Could you take a look at the existing benchmark result files? We usually include the benchmark environment in the result file. We should have it in this benchmark's result too.
