Skip to content

Result set Timestamp handling improvements#166

Merged
Mytherin merged 1 commit intoduckdb:mainfrom
staticlibs:result_set_timestamp
Mar 21, 2025
Merged

Result set Timestamp handling improvements#166
Mytherin merged 1 commit intoduckdb:mainfrom
staticlibs:result_set_timestamp

Conversation

@staticlibs
Copy link
Collaborator

DuckDB supports both TIMESTAMP WITHOUT TIME ZONE and TIMESTAMP WITH TIME ZONE. When gettting Java Timestamp from a result set, user can optionally request desired time zone by specifying a Calendar instance.

Thus we are getting the following invariants:

  1. TIMESTAMP WITH TIME ZONE, no Calendar: returns Timestamp in a system-default (JVM) time zone that represents the same moment in time as the DB's TIMESTAMP WITH TIME ZONE. For example with a system-default time zone being EET (UTC+2) the timestamp with 12:00 time and CET (UTC+1) DB time zone will be returned with 13:00 time (in EET).

  2. TIMESTAMP WITH TIME ZONE, user-specified Calendar: the same logic is used as in point 1., just the time-zone from the user-specified Calendar is used instead of the system-default time zone.

  3. TIMESTAMP WITHOUT TIME ZONE, no Calendar: returns the timestamp that has the same numeric values of all components as the DB's TIMESTAMP WITHOUT TIME ZONE. For example the timestamp with 12:00 time will be returned with the same 12:00 time in any system-defaut time zone.

  4. TIMESTAMP WITHOUT TIME ZONE, user-specified Calendar: this case is tricky because user is requesting to adjust the timestamp for the specified timezone, but DB's timestamp does not denote any time zone that can be used as a starting point for adjustment. In this case we are using the same logic as Postgres' JDBC driver:

https://github.com/pgjdbc/pgjdbc/blob/d86d7ce19030230f14e573b8095f04721d1d8e35/pgjdbc/src/main/java/org/postgresql/jdbc/TimestampUtils.java#L1606

So we assume that the incoming time stamp belongs to the system-default time zone. Then we calculating the offset between the system-default and the user-requested time zones that is effective at the moment of time defined by our timestamp in a system-default time zone. Note, that such offset can be different depending on the time of year due to DST. And then shifting the timestamp by this offset - this can seen as a "reverse" shift comparing to points 1. and 2.. For example, with a system-default time zone being EET (UTC+2) and user-requested time zone being CET (UTC+1), the timestamp with time 12:00 will be shifted by the difference between EET and CET and returned as 13:00 (in CET).

Additionally the logic added for getting date-only and time-only values from the TIMESTAMP_* fields is added. It is done by getting the Java Timestamp first (using the invariants described above) and then getting date-only or time-only part of this timestamp.

Testing: existing tests adjusted to match the new time zone handling logic; new test is added for getting Date and Time from TIMESTAMP.

Fixes: #146

DuckDB supports both `TIMESTAMP WITHOUT TIME ZONE` and
`TIMESTAMP WITH TIME ZONE`. When gettting Java `Timestamp` from a
result set, user can optionally request desired time zone by specifying
a `Calendar` instance.

Thus we are getting the following invariants:

1. `TIMESTAMP WITH TIME ZONE`, no `Calendar`: returns
`Timestamp` in a system-default (JVM) time zone that represents the
same moment in time as the DB's `TIMESTAMP WITH TIME ZONE`. For example
with a system-default time zone being `EET` (`UTC+2`) the timestamp with
`12:00` time and `CET` (`UTC+1`) DB time zone will be returned with
`13:00` time (in `EET`).

2. `TIMESTAMP WITH TIME ZONE`, user-specified Calendar: the same logic
is used as in point `1.`, just the time-zone from the user-specified
Calendar is used instead of the system-default time zone.

3. `TIMESTAMP WITHOUT TIME ZONE`, no `Calendar`: returns the timestamp
that has the same numeric values of all components as the DB's
`TIMESTAMP WITHOUT TIME ZONE`. For example the timestamp with `12:00`
time will be returned with the same `12:00` time in any system-defaut
time zone.

4. `TIMESTAMP WITHOUT TIME ZONE`, user-specified `Calendar`: this case
is tricky because user is requesting to adjust the timestamp for the
specified timezone, but DB's timestamp does not denote any time zone
that can be used as a starting point for adjustment. In this case we
are using the same logic as Postgres' JDBC driver:

https://github.com/pgjdbc/pgjdbc/blob/d86d7ce19030230f14e573b8095f04721d1d8e35/pgjdbc/src/main/java/org/postgresql/jdbc/TimestampUtils.java#L1606

So we assume that the incoming time stamp belongs to the
system-default time zone. Then we calculating the offset between the
system-default and the user-requested time zones that is effective at
the moment of time defined by our timestamp in a system-default time
zone. Note, that such offset can be different depending on the time of
year due to DST. And then shifting the timestamp by this offset - this
can seen as a "reverse" shift comparing to points `1.` and `2.`. For
example, with a system-default time zone being `EET` (`UTC+2`) and
user-requested time zone being `CET` (`UTC+1`), the timestamp with time
`12:00` will be shifted by the difference between `EET` and `CET` and
returned as `13:00` (in `CET`).

Additionally the logic added for getting date-only and time-only values
from the `TIMESTAMP_*` fields is added. It is done by getting the Java
`Timestamp` first (using the invariants described above) and then
getting date-only or time-only part of this timestamp.

Testing: existing tests adjusted to match the new time zone handling
logic; new test is added for getting `Date` and `Time` from `TIMESTAMP`.

Fixes: duckdb#146
@Mytherin
Copy link
Contributor

Thanks for the PR! Matching the logic of the Postgres driver makes sense to me.

TIMESTAMP WITH TIME ZONE, no Calendar: returns Timestamp in a system-default (JVM) time zone that represents the same moment in time as the DB's TIMESTAMP WITH TIME ZONE. For example with a system-default time zone being EET (UTC+2) the timestamp with 12:00 time and CET (UTC+1) DB time zone will be returned with 13:00 time (in EET).

This part is a bit unclear to me - TIMESTAMP WITH TIME ZONE is internally stored in DuckDB as a UTC offset - so returning that offset always returns a UTC time regardless of the user-specified timezone in the database (unless a string conversion is triggered inside DuckDB - which bins the timestamp to the user-specified timezone).

I think that behavior is fine and makes sense - but just figured I would mention this in case that was not clear.

@staticlibs
Copy link
Collaborator Author

Thanks for the review! Yes, the wording here is ambiguous, under "DB time zone" I mean the time zone offset that is already included with the DB's TIMESTAMP WITH TIMEZONE value, NOT the DB-level default time zone (e.g. set by user with SET TIMEZONE). In this case the timestamp value fetched from DB (that already includes the offset in its UTC value) is adjusted to the JVM-default time zone. DB-level default time zone does NOT affect the conversion in any of 4 cases.

@Mytherin
Copy link
Contributor

Alright - thanks for the clarification!

@Mytherin Mytherin merged commit 0086304 into duckdb:main Mar 21, 2025
7 checks passed
@staticlibs staticlibs deleted the result_set_timestamp branch March 21, 2025 15:17
staticlibs added a commit to staticlibs/duckdb-java that referenced this pull request Apr 18, 2025
This change implements a number of setter methods on
`PreparedStatement` that were previously throwing
`SQLFeatureNotSupportedException`.

`set*Stream` group of methods is implemented only for compatibility
with existing tools, no actual streaming is used on DB level - full
input is read to string/bytes before passing it to DB.

`execute`/`executeUpdate` methods for generated keys are implemented
only for cases when generated keys are not requested.

`setTime/Date/Timestamp` methods with `Calendar` (time zone) support
are implemented fully following the same approach used before in duckdb#166.
Also the logic for `setTime` method without `Calendar` is fixed to
correctly NOT use default JVM time zone.

Testing: new tests added to cover implemented methods; timestamp tests
are moved to separate file and time zone-specific tests are added.

Fixes: duckdb#195
staticlibs added a commit to staticlibs/duckdb-java that referenced this pull request Apr 22, 2025
This change implements a number of setter methods on
`PreparedStatement` that were previously throwing
`SQLFeatureNotSupportedException`.

`set*Stream` group of methods is implemented only for compatibility
with existing tools, no actual streaming is used on DB level - full
input is read to string/bytes before passing it to DB.

`execute`/`executeUpdate` methods for generated keys are implemented
only for cases when generated keys are not requested.

`setTime/Date/Timestamp` methods with `Calendar` (time zone) support
are implemented fully following the same approach used before in duckdb#166.
Also the logic for `setTime` method without `Calendar` is fixed to
correctly NOT use default JVM time zone.

Testing: new tests added to cover implemented methods; timestamp tests
are moved to separate file and time zone-specific tests are added.

Fixes: duckdb#195
staticlibs added a commit that referenced this pull request Apr 22, 2025
This change implements a number of setter methods on
`PreparedStatement` that were previously throwing
`SQLFeatureNotSupportedException`.

`set*Stream` group of methods is implemented only for compatibility
with existing tools, no actual streaming is used on DB level - full
input is read to string/bytes before passing it to DB.

`execute`/`executeUpdate` methods for generated keys are implemented
only for cases when generated keys are not requested.

`setTime/Date/Timestamp` methods with `Calendar` (time zone) support
are implemented fully following the same approach used before in #166.
Also the logic for `setTime` method without `Calendar` is fixed to
correctly NOT use default JVM time zone.

Testing: new tests added to cover implemented methods; timestamp tests
are moved to separate file and time zone-specific tests are added.

Fixes: #195
staticlibs added a commit to staticlibs/duckdb-odbc that referenced this pull request May 9, 2025
DuckDB supports `TIMESTAMP WITH TIME ZONE` data type that stores UTC
dates (with specified time zone already applied on DB insert). ODBC
does not have a notion of time zones. When client app reads such
`TIMESTAMP WITH TIME ZONE` it needs to be converted into
`TIMESTAMP WITHOUT TIME ZONE` with client-local time zone applied. And
such converted timestamp must represent the same moment of time as
stored UTC timestamp.

Applying client-local time zone to timestamp is a non-trivial operation
 - the UTC offset depends on both the time zone (for fixed shift) and on
the timestamp value (for DST shift).

There are two options to get the effective offset for the specified UTC
timestamp in the specified time zone:

 - DuckDB ICU extension, that will use DB-configured time zone
 - OS API, will use OS-configured time zone

Variant with ICU extension appeared to be problematic, as we would like
to not link neither to ICU extension (C++ API, not in DuckDB C API) nor
to ICU itself (C API exists, but symbols include ICU major version that
can change). We can do ICU call from ODBC by creating SQL string and
executing it. But this conversion must be performed for every value in
every row. And there no reasonbly-easy ways to cache the results. Thus
the idea of SQL calls was rejected.

Variant with OS API matches the approach used in JDBC (see
duckdb/duckdb-java#166) where JVM-configured time zone is used
(invariant 1 there).

The problem with accessing OS API is that `<ctime>` utilities are not
thread-safe until `C++20`. Thus Windows `<timezoneapi.h>` and POSIX
`<time.h>` API are used directly. And `<ctime>` approach is used in
tests for cross-checking.

Testing: new test added that fetches `TIMESTAMP WITH TIME ZONE` value
as `SQL_C_TYPE_TIMESTAMP` and checks that it is shifted correctly.
staticlibs added a commit to staticlibs/duckdb-odbc that referenced this pull request May 9, 2025
DuckDB supports `TIMESTAMP WITH TIME ZONE` data type that stores UTC
dates (with specified time zone already applied on DB insert). ODBC
does not have a notion of time zones. When client app reads such
`TIMESTAMP WITH TIME ZONE` it needs to be converted into
`TIMESTAMP WITHOUT TIME ZONE` with client-local time zone applied. And
such converted timestamp must represent the same moment of time as
stored UTC timestamp.

Applying client-local time zone to timestamp is a non-trivial operation
 - the UTC offset depends on both the time zone (for fixed shift) and on
the timestamp value (for DST shift).

There are two options to get the effective offset for the specified UTC
timestamp in the specified time zone:

 - DuckDB ICU extension, that will use DB-configured time zone
 - OS API, will use OS-configured time zone

Variant with ICU extension appeared to be problematic, as we would like
to not link neither to ICU extension (C++ API, not in DuckDB C API) nor
to ICU itself (C API exists, but symbols include ICU major version that
can change). We can do ICU call from ODBC by creating SQL string and
executing it. But this conversion must be performed for every value in
every row. And there no reasonbly-easy ways to cache the results. Thus
the idea of SQL calls was rejected.

Variant with OS API matches the approach used in JDBC (see
duckdb/duckdb-java#166) where JVM-configured time zone is used
(invariant 1 there).

The problem with accessing OS API is that `<ctime>` utilities are not
thread-safe until `C++20`. Thus Windows `<timezoneapi.h>` and POSIX
`<time.h>` API are used directly. And `<ctime>` approach is used in
tests for cross-checking.

Testing: new test added that fetches `TIMESTAMP WITH TIME ZONE` value
as `SQL_C_TYPE_TIMESTAMP` and checks that it is shifted correctly.
staticlibs added a commit to staticlibs/duckdb-odbc that referenced this pull request May 14, 2025
DuckDB supports `TIMESTAMP WITH TIME ZONE` data type that stores UTC
dates (with specified time zone already applied on DB insert). ODBC
does not have a notion of time zones. When client app reads such
`TIMESTAMP WITH TIME ZONE` it needs to be converted into
`TIMESTAMP WITHOUT TIME ZONE` with client-local time zone applied. And
such converted timestamp must represent the same moment of time as
stored UTC timestamp.

Applying client-local time zone to timestamp is a non-trivial operation
 - the UTC offset depends on both the time zone (for fixed shift) and on
the timestamp value (for DST shift).

There are two options to get the effective offset for the specified UTC
timestamp in the specified time zone:

 - DuckDB ICU extension, that will use DB-configured time zone
 - OS API, will use OS-configured time zone

Variant with ICU extension appeared to be problematic, as we would like
to not link neither to ICU extension (C++ API, not in DuckDB C API) nor
to ICU itself (C API exists, but symbols include ICU major version that
can change). We can do ICU call from ODBC by creating SQL string and
executing it. But this conversion must be performed for every value in
every row. And there no reasonbly-easy ways to cache the results. Thus
the idea of SQL calls was rejected.

Variant with OS API matches the approach used in JDBC (see
duckdb/duckdb-java#166) where JVM-configured time zone is used
(invariant 1 there).

The problem with accessing OS API is that `<ctime>` utilities are not
thread-safe until `C++20`. Thus Windows `<timezoneapi.h>` and POSIX
`<time.h>` API are used directly. And `<ctime>` approach is used in
tests for cross-checking.

Testing: new test added that fetches `TIMESTAMP WITH TIME ZONE` value
as `SQL_C_TYPE_TIMESTAMP` and checks that it is shifted correctly.
staticlibs added a commit to duckdb/duckdb-odbc that referenced this pull request May 22, 2025
DuckDB supports `TIMESTAMP WITH TIME ZONE` data type that stores UTC
dates (with specified time zone already applied on DB insert). ODBC
does not have a notion of time zones. When client app reads such
`TIMESTAMP WITH TIME ZONE` it needs to be converted into
`TIMESTAMP WITHOUT TIME ZONE` with client-local time zone applied. And
such converted timestamp must represent the same moment of time as
stored UTC timestamp.

Applying client-local time zone to timestamp is a non-trivial operation
 - the UTC offset depends on both the time zone (for fixed shift) and on
the timestamp value (for DST shift).

There are two options to get the effective offset for the specified UTC
timestamp in the specified time zone:

 - DuckDB ICU extension, that will use DB-configured time zone
 - OS API, will use OS-configured time zone

Variant with ICU extension appeared to be problematic, as we would like
to not link neither to ICU extension (C++ API, not in DuckDB C API) nor
to ICU itself (C API exists, but symbols include ICU major version that
can change). We can do ICU call from ODBC by creating SQL string and
executing it. But this conversion must be performed for every value in
every row. And there no reasonbly-easy ways to cache the results. Thus
the idea of SQL calls was rejected.

Variant with OS API matches the approach used in JDBC (see
duckdb/duckdb-java#166) where JVM-configured time zone is used
(invariant 1 there).

The problem with accessing OS API is that `<ctime>` utilities are not
thread-safe until `C++20`. Thus Windows `<timezoneapi.h>` and POSIX
`<time.h>` API are used directly. And `<ctime>` approach is used in
tests for cross-checking.

Testing: new test added that fetches `TIMESTAMP WITH TIME ZONE` value
as `SQL_C_TYPE_TIMESTAMP` and checks that it is shifted correctly.

Note: this commit is a re-do for 8dbf0ac that was added with incorrect
message. Original PR for it is #127.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

NullPointerException when reading JDBC ResultSet.getDate() from a Timestamp column

2 participants