Skip to content

[GSoC 2026] Kafka Streams Runner — skeleton Gradle module + pipeline entry points #38465

@junaiddshaukat

Description

@junaiddshaukat

Tracking issue: #18479

Summary

First sub-issue under the Kafka Streams Runner GSoC 2026 project. Scope is the
minimum surface area needed for a portable pipeline to be submittable
to a KafkaStreamsRunner. Pipeline execution will still fail at translate-time
because no transform translators exist yet — that's intentional. Subsequent
sub-issues will incrementally add translators (Impulse, ExecutableStage,
Flatten, etc.), each shrinking the set of "unsupported transform" failures.

Design doc reference

Portable Kafka Streams Runner for Apache Beam — design doc
(co-authored with @je-ik).

Scope of this issue

Create runners/kafka-streams/ as a new Gradle module with:

  • runners/kafka-streams/build.gradle — module build file, declares
    dependencies on runners-core-java, runners-java-fn-execution,
    runners-java-job-service, kafka-clients 3.9.x, kafka-streams 3.9.x,
    and pulls in the standard Beam testing dependencies.
  • Update settings.gradle.kts to include :runners:kafka-streams.
  • KafkaStreamsRunner extending PipelineRunner<PipelineResult>
    delegates to the portable pipeline runner path.
  • KafkaStreamsPipelineOptions extending PortablePipelineOptions
    bootstrapServers, applicationId, processingGuarantee,
    maxBundleSize, maxBundleTimeMs, stateDir.
  • KafkaStreamsRunnerRegistrar — auto-discovery via
    META-INF/services/org.apache.beam.sdk.options.PipelineOptionsRegistrar
    and PipelineRunnerRegistrar.
  • Empty stubs (just enough to compile):
    - KafkaStreamsJobServerDriver extending JobServerDriver
    - KafkaStreamsJobInvoker extending JobInvoker
    - translation/KafkaStreamsPipelineTranslator
    - translation/KafkaStreamsTranslationContext
  • package-info.java in each new package.

Acceptance criteria

  • ./gradlew :runners:kafka-streams:compileJava passes with no errors.
  • ./gradlew :runners:kafka-streams:check passes (style, spotless,
    licence-header checks).
  • Submitting a trivial pipeline (e.g. Impulse only) to the runner via
    its JobInvoker does not crash before reaching the translator — it
    should fail inside the translator with a clear
    "no translator registered for URN ..." error.

Out of scope (deferred to follow-up sub-issues)

  • Any actual transform translator (Impulse, ExecutableStage, GBK, etc.)
  • Watermark manager
  • State management
  • Tests beyond compile + style — full unit / integration tests come once at
    least Impulse + one stateless ParDo are wired up.

Reference implementation pattern

Follows the Flink portable runner module layout:

  • runners/flink/2.0/src/main/java/org/apache/beam/runners/flink/FlinkPipelineRunner.java
  • runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java
  • runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobInvoker.java

(I'm not copying code — using these as a structural reference for the URN
dispatch + JobInvoker pattern. The Kafka Streams runner does not need
multi-version build fan-out like Flink does, so the module layout will be
single-version.)

cc @je-ik

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions