
Conversation

@wtgee
Member

@wtgee wtgee commented Dec 24, 2025

  • Add bulk COPY command for insert dataframe

For inserting 65k rows into cobra_target, this offers a ~10x speedup.


Copilot AI left a comment


Pull request overview

This PR implements bulk insert optimization using PostgreSQL's COPY command, providing significant performance improvements (claimed 10x speedup for 65k rows) over the previous multi-row INSERT approach.

Key Changes:

  • Added new psql_insert_copy function that uses PostgreSQL's COPY command for bulk data loading
  • Added use_copy parameter (default True) to insert_dataframe method to enable/disable COPY optimization
  • Added performance timing to track insert operations
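For context on the pattern: `DataFrame.to_sql` accepts a `method` callable, and the pandas documentation includes a COPY-based recipe along these lines. Below is a minimal sketch of such a function, assuming psycopg2 underneath a SQLAlchemy connection; the actual `psql_insert_copy` in this PR may differ in details.

```python
import csv
from io import StringIO

def psql_insert_copy(table, conn, keys, data_iter):
    """Insert method for DataFrame.to_sql that streams rows through
    PostgreSQL's COPY instead of multi-row INSERTs.

    Sketch based on the recipe in the pandas documentation; the
    function in this PR may differ in details.
    """
    # Raw DBAPI (psycopg2) connection behind the SQLAlchemy connectable.
    dbapi_conn = conn.connection
    with dbapi_conn.cursor() as cur:
        # Serialize the rows into an in-memory CSV buffer.
        buf = StringIO()
        writer = csv.writer(buf)
        writer.writerows(data_iter)
        buf.seek(0)

        columns = ', '.join(f'"{k}"' for k in keys)
        table_name = f'{table.schema}.{table.name}' if table.schema else table.name

        sql = f'COPY {table_name} ({columns}) FROM STDIN WITH CSV'
        cur.copy_expert(sql=sql, file=buf)
```

Used as, e.g., `df.to_sql('cobra_target', engine, if_exists='append', index=False, method=psql_insert_copy)`. COPY avoids per-statement parsing and network round trips, which is where the bulk of the speedup comes from.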


Member

@CraigLoomis CraigLoomis left a comment


I'll suggest making that WITH (FORMAT csv, HEADER MATCH) (and whatever the equivalent of df.to_csv(..., header=True) is) to ward against the worst mistakes.
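In sketch form, that suggestion amounts to writing a header row into the buffer and letting the server verify it (HEADER MATCH is available for COPY FROM since PostgreSQL 15). The function name here is hypothetical, for illustration only:

```python
import csv
from io import StringIO

def psql_insert_copy_checked(table, conn, keys, data_iter):
    """Variant of the COPY insert method that writes a header row and
    asks PostgreSQL to verify it against the column list.
    Hypothetical name; HEADER MATCH requires PostgreSQL 15+."""
    dbapi_conn = conn.connection
    with dbapi_conn.cursor() as cur:
        buf = StringIO()
        writer = csv.writer(buf)
        writer.writerow(keys)        # header row, the df.to_csv(..., header=True) analogue
        writer.writerows(data_iter)
        buf.seek(0)

        columns = ', '.join(f'"{k}"' for k in keys)
        sql = f'COPY {table.name} ({columns}) FROM STDIN WITH (FORMAT csv, HEADER MATCH)'
        cur.copy_expert(sql=sql, file=buf)
```

With this, a misnamed or out-of-order column makes COPY fail outright rather than silently loading values into the wrong columns.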


Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@wtgee wtgee force-pushed the tickets/INSTRM-2821 branch 2 times, most recently from c9f780b to d05bc09 Compare December 24, 2025 19:54
@wtgee
Member Author

wtgee commented Dec 24, 2025

I'll suggest making that WITH (FORMAT csv, HEADER MATCH) (and whatever the equivalent of df.to_csv(..., header=True) is) to ward against the worst mistakes.

Since this is coming from the to_sql command itself, it has the keys parameter that specifies the exact columns, so it should be fine to rely on an explicit HEADER FALSE and the ordering of the keys.

I've added some explicit parameters to the csv writer and some other dataframe scrubbing checks that shouldn't interfere with the data. It's actually running even faster with these explicit parameters, I guess because it doesn't have to do an initial pass or conversions.
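The kind of explicit writer settings and scrubbing described here might look like the following sketch; the exact parameters and checks in the PR are not reproduced, and scrub_dataframe is a hypothetical name:

```python
import csv
from io import StringIO

import numpy as np
import pandas as pd

def scrub_dataframe(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical pre-insert scrubbing: map NaN to None so the CSV
    writer emits empty fields, which COPY ... CSV reads as NULL."""
    return df.replace({np.nan: None})

buf = StringIO()
writer = csv.writer(
    buf,
    delimiter=',',              # explicit, matching COPY's CSV default
    quoting=csv.QUOTE_MINIMAL,  # quote only when a field requires it
    lineterminator='\n',        # fixed newline, independent of platform
)
```

Pinning the dialect down explicitly (csv.writer defaults to lineterminator='\r\n', for example) removes any ambiguity about what COPY receives on the other end.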


Copilot AI left a comment


Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 9 comments.



@wtgee wtgee force-pushed the tickets/INSTRM-2821 branch from d05bc09 to c96de62 Compare December 24, 2025 20:12

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@wtgee wtgee force-pushed the tickets/INSTRM-2821 branch 2 times, most recently from 15b0653 to 13579a7 Compare December 24, 2025 20:18
* Add bulk COPY command for insert dataframe
* Scrub the dataframe before inserting.
@wtgee wtgee force-pushed the tickets/INSTRM-2821 branch from 13579a7 to 2025a74 Compare December 24, 2025 21:07