Intermittent IAM Authentication Failure (SQLSTATE 08P01) when using Managed Connection Pooling (MCP) #2553

@yanivamram

Bug Description

We are experiencing intermittent authentication failures when connecting to a Cloud SQL PostgreSQL Enterprise Plus instance using the Cloud SQL Proxy as a GKE sidecar with --auto-iam-authn.

The issue appears only when Managed Connection Pooling (MCP) is enabled. After a period of stability, multiple backend services fail simultaneously with the following error:

FATAL: Cloud SQL IAM service account authentication failed for user "..." (SQLSTATE 08P01)

Environment

  • Proxy Version: 2.21
  • Database: PostgreSQL Enterprise Plus (v14)
  • Environment: GKE with Proxy Sidecar
  • Authentication: IAM with --auto-iam-authn
  • Feature: Managed Connection Pooling (MCP) Enabled

Observations

Token Expiration: Google Support suggests the root cause is IAM token expiration (tokens are valid for approximately 1 hour). With MCP active, pooled connections appear to hold on to expired tokens, so subsequent queries fail.

Frequency: The issue recurs every 1–2 days.

Temporary Resolution: Disabling MCP resolves the SQLSTATE 08P01 error but leads to max_connections exhaustion due to the lack of pooling.

Mitigation Attempts: We were advised to upgrade to v2.21+ and to recycle connections aggressively (every 1–10 minutes). This advice suggests that the Proxy or MCP is not refreshing tokens for long-lived pooled connections as expected.
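
For reference, a minimal sketch of what the client-side recycling in the mitigation above amounts to. It is not taken from our services: `RecyclingPool`, the `connect` factory, and the lifetime value are hypothetical names for illustration, and real drivers and pools expose their own settings for this. It only shows the idea of discarding any connection older than a fixed lifetime so that a fresh login happens well before the roughly 1-hour token expiry. Whether recycling client connections also refreshes the credentials held by MCP's server-side connections is exactly what we are unsure about.

```python
import time
from collections import deque
from typing import Any, Callable, Deque, Tuple


class RecyclingPool:
    """Toy client-side pool that discards connections older than max_lifetime.

    Hypothetical illustration only; not the API of any real driver or pool.
    """

    def __init__(
        self,
        connect: Callable[[], Any],
        max_lifetime: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._connect = connect            # assumed factory returning an object with close()
        self._max_lifetime = max_lifetime  # seconds; keep well below the token lifetime
        self._clock = clock                # injectable so the behavior can be tested
        self._idle: Deque[Tuple[Any, float]] = deque()  # (connection, created_at)

    def acquire(self) -> Tuple[Any, float]:
        """Return a (connection, created_at) pair younger than max_lifetime."""
        now = self._clock()
        while self._idle:
            conn, created_at = self._idle.popleft()
            if now - created_at < self._max_lifetime:
                return conn, created_at
            conn.close()  # too old: drop it so the next use performs a fresh login
        return self._connect(), now

    def release(self, conn: Any, created_at: float) -> None:
        """Return a connection to the pool, or close it if it has aged out."""
        if self._clock() - created_at < self._max_lifetime:
            self._idle.append((conn, created_at))
        else:
            conn.close()
```

With `max_lifetime=600`, a connection is reused for up to 10 minutes and then replaced on the next `acquire()`, which matches the upper end of the interval we were advised to use.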

Expected Behavior
The Cloud SQL Proxy (or the MCP integration) should transparently refresh IAM credentials/tokens so that long-lived connections in a managed pool do not fail with authentication errors.

Questions for Maintainers

  • Is there a known incompatibility between MCP and the Proxy's IAM refresh logic?
  • Should --auto-iam-authn handle token rotation automatically for connections managed by the server-side MCP, or must users configure client-side connection recycling (e.g., max_connection_lifetime) themselves?

Labels

type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)