Flink: Allow custom writer factory in DynamicIcebergSink #16339

Open
sqd wants to merge 1 commit into apache:main from sqd:oss_custom_writer

sqd (Contributor) commented May 14, 2026

Introduce DynamicTaskWriterFactoryProvider so callers can supply a custom TaskWriterFactory<RowData> in place of the default RowDataTaskWriterFactory, while reusing the table, schema, partition spec, and write-property resolution that DynamicWriter already performs.

The primary motivation is throughput. Our pipelines have a data pattern, closely tied to our business logic, that a hand-rolled TaskWriter can exploit to produce files far faster than the writers created by the generic RowDataTaskWriterFactory.

Making the factory pluggable also enables other use cases without forking the sink:

  • Row-level or file-level audit and metrics: sampling, lineage stamps, metric counters layered around the writer.
  • Custom file naming and layout: custom prefixes, alternative partition paths, custom filesystem properties such as storage class and permissions.

The default provider preserves existing behavior, so callers that do not supply one are unaffected.
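
To make the provider idea concrete, here is a minimal, self-contained sketch of the pattern, using the "metric counters layered around the writer" use case. It does not use the Iceberg API: the description above does not show the signature of DynamicTaskWriterFactoryProvider, so `TaskWriter`, `TaskWriterFactory`, `WriterFactoryProvider`, `defaultProvider`, and `counting` below are all hypothetical, simplified stand-ins, and the real interfaces in this PR may take different arguments (for example table, schema, and partition spec rather than a table name).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class WriterFactoryProviderSketch {

  // Hypothetical stand-in for a task writer: only the method this sketch needs.
  interface TaskWriter<T> {
    void write(T row);
  }

  // Hypothetical stand-in for a writer factory.
  interface TaskWriterFactory<T> {
    TaskWriter<T> create();
  }

  // Hypothetical provider shape (assumption): given what the sink has already
  // resolved (reduced here to a table name), return the factory to use.
  interface WriterFactoryProvider<T> {
    TaskWriterFactory<T> provide(String tableName);
  }

  // Plays the role of the default provider: writers simply append rows to a list.
  static <T> WriterFactoryProvider<T> defaultProvider(List<T> sink) {
    return tableName -> () -> row -> sink.add(row);
  }

  // Custom provider that wraps another provider and counts every row written,
  // then delegates to the wrapped writer.
  static <T> WriterFactoryProvider<T> counting(
      WriterFactoryProvider<T> delegate, AtomicLong counter) {
    return tableName -> {
      TaskWriterFactory<T> inner = delegate.provide(tableName);
      return () -> {
        TaskWriter<T> writer = inner.create();
        return row -> {
          counter.incrementAndGet();
          writer.write(row);
        };
      };
    };
  }

  public static void main(String[] args) {
    List<String> sink = new ArrayList<>();
    AtomicLong counter = new AtomicLong();

    // A caller opts in by passing a custom provider; otherwise the default is used.
    WriterFactoryProvider<String> provider = counting(defaultProvider(sink), counter);
    TaskWriter<String> writer = provider.provide("db.events").create();

    writer.write("a");
    writer.write("b");
    System.out.println(counter.get() + " rows counted, " + sink.size() + " rows written");
  }
}
```

Because the custom provider only wraps the factory it is given, the default write path stays untouched for callers that do not opt in, which is the compatibility property described above.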
