Core: Replace per-table has_table N+1 with single get_table_names on …#2472
Tschuppi81 wants to merge 4 commits
Conversation
…schema init ensure_schema_exists called metadata.create_all(), which uses SQLAlchemy's has_table() to check each table individually before creating it: one pg_catalog.pg_class query per model. With ~152 tables registered in a typical app, this produced 152 round trips on every cold schema init (the first request after a server restart).

Replace this with a single Inspector.get_table_names() call that fetches all existing table names in one query, then pass only the missing tables to create_all(..., checkfirst=False).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
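A minimal sketch of the described optimization, assuming a plain SQLAlchemy setup; the function name `ensure_tables_exist` and the example tables are illustrative, not the actual onegov code:

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, inspect

def ensure_tables_exist(engine, metadata):
    # One round trip: fetch every table name already present in the schema
    existing = set(inspect(engine).get_table_names())
    # Only the tables that are actually missing need to be created;
    # sorted_tables keeps foreign-key dependency order intact
    missing = [t for t in metadata.sorted_tables if t.name not in existing]
    if missing:
        # checkfirst=False skips the per-table has_table() query
        metadata.create_all(engine, tables=missing, checkfirst=False)

metadata = MetaData()
Table('users', metadata, Column('id', Integer, primary_key=True))
Table('pages', metadata, Column('id', Integer, primary_key=True))

engine = create_engine('sqlite://')
ensure_tables_exist(engine, metadata)  # creates both tables
ensure_tables_exist(engine, metadata)  # no-op, single inspection query
print(sorted(inspect(engine).get_table_names()))  # → ['pages', 'users']
```

The second call issues exactly one catalog query regardless of how many tables are registered, which is the whole point of the change.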
Codecov Report ✅ All modified and coverable lines are covered by tests.
... and 1 file with indirect coverage changes.
When self.bases contains duplicate or overlapping entries, tables created in a first iteration were not reflected in existing_tables for subsequent ones, causing DuplicateTable errors with checkfirst=False. Update existing_tables after each create_all call to track what was just created.

Also remove an accidental duplicate Base append in test_unique_column_value_validator that was previously masked by checkfirst=True.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
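A hedged sketch of this follow-up fix: when several bases overlap on the same table, the set of existing tables must be updated after each create_all() so a later iteration does not retry a table with checkfirst=False. The `bases` list and loop shape here are illustrative stand-ins for self.bases:

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, inspect

engine = create_engine('sqlite://')

shared = MetaData()
Table('shared', shared, Column('id', Integer, primary_key=True))

# Two "bases" whose metadata overlaps on the same table name
bases = [shared, shared]

existing = set(inspect(engine).get_table_names())
for metadata in bases:
    missing = [t for t in metadata.sorted_tables if t.name not in existing]
    if missing:
        metadata.create_all(engine, tables=missing, checkfirst=False)
        # Without this update, the second iteration would retry 'shared'
        # and fail (DuplicateTable on PostgreSQL)
        existing.update(t.name for t in missing)

print(inspect(engine).get_table_names())  # → ['shared']
```

The single get_table_names() snapshot is taken once up front, so the in-memory set is the only thing that needs to stay current across iterations.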
Daverball left a comment
This seems like the kind of optimization SQLAlchemy should be doing itself inside MetaData.create_all(). It might be worth opening an issue on their bug tracker.
I don't know if I'm comfortable with doing this optimization in our code, especially since it only matters the first time a schema is accessed, so we don't really gain any substantial benefits during normal operation of the application. It might slightly speed up some CLI commands, but probably not to a noticeable degree: the biggest overhead there is podman spinning up an extra instance of the onegov container and the initial loading of all the Python modules.
Core: Replace N+1 has_table queries with single get_table_names on schema init
TYPE: Performance
LINK: None
https://seantis-gmbh.sentry.io/issues/7472765445/
https://seantis-gmbh.sentry.io/issues/7357002739/
...
The fix fetches all existing table names in the schema with a single Inspector.get_table_names() call, then passes only the missing tables to create_all(..., checkfirst=False), reducing the existence checks from O(n) queries to a single query.