fix: boolean round-trip test and CSV datetime loading errors #1000
- Fix `test_dataframe_round_trip_with_table_schema` failure by expecting `pd.NA` for boolean columns loaded as object, aligning with BigQuery Storage API behavior.
- Fix CSV loading failure for extreme datetimes (e.g., year 0001) by introducing `cast_dataframe_for_csv`. This helper forces `isoformat()` string conversion for DATETIME/TIMESTAMP columns, ensuring 4-digit years (e.g., `0001-01-01` instead of `1-01-01`), which prevents BigQuery BadRequest errors.
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with

For security, I will only act on instructions from the user who triggered this task.

New to Jules? Learn more at jules.google/docs.
- Fix `test_dataframe_round_trip_with_table_schema` failure by expecting `pd.NA` for boolean columns loaded as object, aligning with BigQuery Storage API behavior.
- Fix CSV loading failure for extreme datetimes (e.g., year 0001) by introducing `cast_dataframe_for_csv`. This helper forces `isoformat()` string conversion for DATETIME/TIMESTAMP columns, ensuring 4-digit years (e.g., `0001-01-01` instead of `1-01-01`).
- `cast_dataframe_for_csv` is robust against non-datetime inputs (falls back to original value) and efficient (batch assigns new columns).
- Code formatting applied with `black`.
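The conversion described above can be sketched roughly as follows. This is not the code from this PR: it is a minimal, standard-library-only illustration that stands in a plain dict of lists for the DataFrame, and the names `to_iso_or_passthrough`, `cast_columns_for_csv`, and `schema_types` are hypothetical. It only shows the two properties the commit message mentions: `isoformat()` always emits a zero-padded 4-digit year, and non-datetime values fall back to the original value.

```python
import datetime


def to_iso_or_passthrough(value):
    """Return value.isoformat() for date/datetime objects, else the value unchanged."""
    if isinstance(value, (datetime.datetime, datetime.date)):
        # isoformat() zero-pads the year to four digits, e.g. year 1 -> "0001".
        return value.isoformat()
    # Fallback: None, strings, and other non-datetime inputs are left as-is.
    return value


def cast_columns_for_csv(columns, schema_types):
    """Hypothetical stand-in for the PR's helper.

    columns: dict mapping column name -> list of values.
    schema_types: dict mapping column name -> BigQuery type name (assumed shape).
    """
    # Build all converted columns first, then merge them in one step
    # (mirrors the "batch assigns new columns" idea).
    converted = {
        name: [to_iso_or_passthrough(v) for v in values]
        for name, values in columns.items()
        if schema_types.get(name, "").upper() in {"DATETIME", "TIMESTAMP"}
    }
    return {**columns, **converted}


columns = {
    "created": [datetime.datetime(1, 1, 1), None, "not a date"],
    "count": [1, 2, 3],
}
result = cast_columns_for_csv(columns, {"created": "DATETIME", "count": "INTEGER"})
print(result["created"][0])
```

Columns whose type is not DATETIME or TIMESTAMP pass through untouched, and the input dict is not mutated.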
chalmerlowe
left a comment
LGTM
PR created by the Librarian CLI to initialize a release. Merging this PR will auto trigger a release.

Librarian Version: v0.7.0
Language Image: us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/python-librarian-generator@sha256:c8612d3fffb3f6a32353b2d1abd16b61e87811866f7ec9d65b59b02eb452a620

<details><summary>pandas-gbq: 0.32.0</summary>

## [0.32.0](v0.31.1...v0.32.0) (2025-12-15)

### Features

* Add support for Python 3.14 (#976) ([89b008d](89b008d8))

### Bug Fixes

* boolean round-trip test and CSV datetime loading errors (#1000) ([d443103](d4431030))

</details>
This PR fixes two issues causing CI failures:
- `test_dataframe_round_trip_with_table_schema` now correctly handles `pd.NA` returned by the connector for nullable boolean values, instead of `None`.
- `cast_dataframe_for_csv` pre-formats `DATETIME` and `TIMESTAMP` columns using `.isoformat()` before CSV serialization. This ensures years before 1000 are zero-padded (e.g., `0001-01-01`), avoiding invalid date string errors from BigQuery when loading data.

PR created automatically by Jules for task 5793097527839411486 started by @chalmerlowe