
[Feature Request]: ClickHouseIO: Add DateTime64 support for sub-second timestamp precision #38466

@BentsiLeviav

Problem

ClickHouseIO currently supports only DateTime (second precision) via TypeName.DATETIME, which maps to Beam's Schema.FieldType.DATETIME (a Joda Instant with millisecond precision).
ClickHouse's DateTime64(precision, [timezone]) type, which supports millisecond, microsecond, and nanosecond precision, is not recognized by TableSchema or the column type parser, so users cannot write sub-second timestamps to ClickHouse tables that use DateTime64 columns.

This is becoming a blocker for users whose pipelines emit events with sub-second timestamps (e.g. log/event ingestion, financial data).

Proposed solution

  1. Schema model

    • Add TypeName.DATETIME64.
    • Extend ColumnType with a precision field (0–9) and an optional timezone string, plus a factory ColumnType.dateTime64(int precision, @Nullable String timezone).
  2. Parser

    • Add a grammar rule for DateTime64(<precision>[, '<timezone>']) to ColumnTypeParser.jj.
  3. Beam field type mapping

    • For precision <= 3, continue mapping to Schema.FieldType.DATETIME (Joda Instant, ms precision) for backward compatibility.
    • For precision > 3, map to a Beam logical type that preserves sub-second precision — SqlTypes.TIMESTAMP (java.time.Instant-backed, nanos) is the natural fit.
  4. Writer

    • In ClickHouseWriter, serialize DateTime64 as a little-endian Int64 containing epoch_seconds * 10^precision + sub_second_units, matching the ClickHouse native protocol.
  5. Tests

    • Round-trip tests for precisions 3, 6, 9 against a ClickHouse test container, plus parser unit tests covering DateTime64(3), DateTime64(6, 'UTC'), and the Nullable(DateTime64(...)) wrapper.
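
To make the writer step (item 4) concrete, here is a minimal sketch of the proposed encoding using only the JDK. The class and method names (`DateTime64Sketch`, `toTicks`, `encode`) are hypothetical and are not part of ClickHouseIO. The sketch assumes that digits beyond the column precision are truncated rather than rounded, which the proposal does not specify.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

/** Hypothetical helper (not an existing ClickHouseIO class) illustrating the DateTime64 value. */
public final class DateTime64Sketch {

  private DateTime64Sketch() {}

  /** Returns epoch_seconds * 10^precision + sub_second_units for a precision in [0, 9]. */
  public static long toTicks(Instant instant, int precision) {
    if (precision < 0 || precision > 9) {
      throw new IllegalArgumentException("precision must be in [0, 9], got " + precision);
    }
    long ticksPerSecond = pow10(precision);
    long nanosPerTick = pow10(9 - precision);
    // getNano() is always in [0, 999_999_999], so pre-epoch instants are handled as well.
    // Assumption: extra digits are truncated, not rounded.
    long subSecondUnits = instant.getNano() / nanosPerTick;
    // The *Exact methods throw ArithmeticException instead of silently overflowing.
    return Math.addExact(
        Math.multiplyExact(instant.getEpochSecond(), ticksPerSecond), subSecondUnits);
  }

  /** Encodes the tick count as 8 little-endian bytes (Int64). */
  public static byte[] encode(Instant instant, int precision) {
    return ByteBuffer.allocate(Long.BYTES)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putLong(toTicks(instant, precision))
        .array();
  }

  private static long pow10(int exponent) {
    long result = 1;
    for (int i = 0; i < exponent; i++) {
      result *= 10;
    }
    return result;
  }
}
```

For example, an instant one second and 123,456,789 nanoseconds after the epoch yields 1123 ticks at precision 3, 1123456 at precision 6, and 1123456789 at precision 9. At precision 9 a signed 64-bit tick count only covers roughly the years 1677 to 2262, so a real writer would need to decide how to report out-of-range values; this sketch simply lets the `ArithmeticException` propagate.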

Issue Priority

Priority: 2 (default / most feature requests should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Prism Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
