Fix OperationalError not caught during reconnect on dead connections#9

Open
terezbw wants to merge 2 commits into master from fix/operational-error-reconnect

@terezbw
Contributor

@terezbw terezbw commented May 26, 2026

When a gunicorn worker is recycled, all its persistent PG connections are closed server-side. On the next request, create_cursor() catches InterfaceError and calls reconnect(). If the connect() call inside reconnect() itself fails (e.g. on a transient TCP timeout), the resulting OperationalError is not caught and reaches the client as a 500 response.

Two changes:

  • Also catch OperationalError from create_cursor() to handle zombie connections where psycopg2 has not yet detected the server-side close
  • If reconnect() → connect() fails with OperationalError, attempt one more fresh connect via ensure_connection() before giving up
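
For reviewers, the two changes above can be sketched roughly as follows. This is a minimal illustration, not the code in this diff: `ConnectionWrapper` and `connect_fn` are hypothetical names, the two exception classes are local stand-ins for `psycopg2.InterfaceError` and `psycopg2.OperationalError` so the snippet runs without a database, and the bodies of connect(), reconnect() and ensure_connection() are assumptions about how the real methods behave.

```python
class InterfaceError(Exception):
    """Local stand-in for psycopg2.InterfaceError (assumption for this sketch)."""


class OperationalError(Exception):
    """Local stand-in for psycopg2.OperationalError (assumption for this sketch)."""


class ConnectionWrapper:
    """Hypothetical wrapper; method names follow the PR description."""

    def __init__(self, connect_fn):
        # connect_fn is a hypothetical factory returning a DB-API connection.
        self._connect_fn = connect_fn
        self.connection = None

    def connect(self):
        self.connection = self._connect_fn()

    def ensure_connection(self):
        # Open a connection only if there is none yet.
        if self.connection is None:
            self.connect()

    def reconnect(self):
        # Drop the old handle first, so a failed connect() leaves no stale one.
        self.connection = None
        self.connect()

    def create_cursor(self):
        self.ensure_connection()
        try:
            return self.connection.cursor()
        except (InterfaceError, OperationalError):
            # Change 1: OperationalError is handled too, covering zombie
            # connections whose server-side close was not detected yet.
            try:
                self.reconnect()
            except OperationalError:
                # Change 2: exactly one more fresh attempt. If this one
                # fails as well, the error propagates to the caller.
                self.connection = None
                self.ensure_connection()
            return self.connection.cursor()
```

The retry is deliberately bounded at one extra attempt: a transient failure gets a second chance, while a database that is really down still surfaces an error instead of stalling the request in a loop.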

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Maksim Kozin seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@terezbw terezbw force-pushed the fix/operational-error-reconnect branch from 02882f9 to 45c237f Compare May 26, 2026 15:38
…ailures

  Previously create_cursor() caught only InterfaceError. When a persistent
  connection is closed server-side (e.g. during gunicorn worker recycling),
  the first cursor attempt raises InterfaceError and triggers reconnect().
  If the subsequent connect() call itself raises OperationalError, the
  exception escaped unhandled, causing a 500 response to the client.

  Changes:
  - create_cursor() now catches both InterfaceError and OperationalError
  - if reconnect() raises OperationalError, one additional attempt is made
    via ensure_connection() before giving up
  - added tests for both new paths
@terezbw terezbw force-pushed the fix/operational-error-reconnect branch from deb0ba4 to 97553ee Compare May 26, 2026 16:41
@sonarqubecloud

Member

@Hairash Hairash left a comment

Hope it'll help
