chore(observe): drop dead pg_stat_activity scrape path #599

Merged: benben merged 1 commit into main from ben/ducklake-metadata-observability-cleanup on May 21, 2026

@benben commented May 21, 2026

Summary

Follow-up cleanup after #598. The postgres_query()-based pg_stat_activity scrape silently no-ops in our DuckDB build, so duckgres_ducklake_metadata_connections stays empty in dev and never reports any data. Dead code is worse than no code here: it suggests that "no data" means "no connections" when it actually means "this never worked".

Removed:

  • scrapeMetadataConnectionsByState and its postgres_query call in duckdbservice/service.go
  • MetadataConnectionsByState gauge in server/observe/metrics.go

Kept (works, confirmed wired):

  • duckgres_ducklake_metadata_pool_max_connections gauge — scraped via duckdb_settings(), which is universally available.
  • duckgres_postgres_scan_seconds{org} histogram and duckdb.postgres_scan_thread_s span attribute (per-query metadata DB time).
  • system.query_log.postgres_scan_ms column. QueryLogger is a real, wired feature (configstore singleton in controlplane/configstore/models.go:287, CLI flag, admin API, plumbed through cmd/duckgres-worker/main.go:214). The column simply sits idle in deployments where QueryLog.Enabled=false and is populated automatically once the feature is turned on.
  • application_name=duckgres/<org-id> injection — Aurora-side pg_stat_activity / Performance Insights are the better tool for live connection state anyway.
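
For readers unfamiliar with the application_name tag mentioned in the last bullet, here is a minimal sketch of how such a tag could be attached to a metadata connection string. It assumes the metadata DSN is a libpq-style postgres:// URI. The helper name withApplicationName, the example host, and the org ID are hypothetical and are not taken from the duckgres code base.

```go
package main

import (
	"fmt"
	"net/url"
)

// withApplicationName is a hypothetical helper (not the duckgres implementation).
// It returns a copy of a postgres:// URI with application_name set to
// "duckgres/<orgID>", so sessions are attributable in pg_stat_activity.
func withApplicationName(rawURI, orgID string) (string, error) {
	u, err := url.Parse(rawURI)
	if err != nil {
		return "", fmt.Errorf("parse metadata DSN: %w", err)
	}
	q := u.Query()
	// Set overwrites any application_name already present in the URI.
	q.Set("application_name", "duckgres/"+orgID)
	// Encode sorts keys and percent-encodes the "/" in the value;
	// libpq decodes percent-encoding when it parses a connection URI.
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// Example host and org ID are placeholders.
	dsn, err := withApplicationName(
		"postgres://user@aurora.example.internal:5432/ducklake?sslmode=require",
		"org-123",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(dsn)
}
```

With a tag like this in place, filtering pg_stat_activity on application_name LIKE 'duckgres/%' on the Aurora side gives the per-org connection view without any worker-side scrape.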

Test plan

  • go build ./...
  • go vet ./...
  • go test ./duckdbservice/ ./server/observe/

The postgres_query()-based pg_stat_activity scrape added in #598 silently
no-ops in our DuckDB build and never populates the
duckgres_ducklake_metadata_connections gauge. Remove it (and the gauge)
rather than leave dead code that masks a real "no data" signal.

Aurora-side connection / query activity is already attributable to
duckgres via the application_name we tag at ATTACH time, so Performance
Insights and pg_stat_activity are the better tools for that view —
no worker-side scrape needed.

Keeps the pg_pool_max_connections probe via duckdb_settings(), which
works, and the per-query postgres_scan_seconds histogram and
postgres_scan_ms query_log column from #598.
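
As an illustration of the duckdb_settings() probe that is kept, the sketch below turns a setting value into a gauge reading and returns an explicit error when the setting is absent, rather than silently reporting nothing. It is a sketch under assumptions, not the code in duckdbservice/service.go: the settingReader type, the poolMaxConnections function, and the use of pg_pool_max_connections as the exact setting name are all assumed here. In a real worker the reader would run something like SELECT value FROM duckdb_settings() WHERE name = ?.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// settingReader is a hypothetical abstraction over a lookup in duckdb_settings().
// It reports the raw string value and whether the setting exists at all.
type settingReader func(name string) (value string, found bool, err error)

// errSettingMissing marks "the build does not expose this setting", which is
// different from "the setting is zero".
var errSettingMissing = errors.New("setting not reported by duckdb_settings()")

// poolMaxConnections reads the (assumed) pg_pool_max_connections setting and
// converts it to a float64 suitable for a gauge.
func poolMaxConnections(read settingReader) (float64, error) {
	const name = "pg_pool_max_connections" // assumed setting name
	raw, found, err := read(name)
	if err != nil {
		return 0, fmt.Errorf("read %s: %w", name, err)
	}
	if !found {
		// Surface the gap instead of leaving the gauge silently empty.
		return 0, fmt.Errorf("%s: %w", name, errSettingMissing)
	}
	n, err := strconv.ParseInt(strings.TrimSpace(raw), 10, 64)
	if err != nil {
		return 0, fmt.Errorf("parse %s=%q: %w", name, raw, err)
	}
	return float64(n), nil
}

func main() {
	// Fake reader standing in for a real DuckDB query.
	fake := func(name string) (string, bool, error) { return "64", true, nil }
	v, err := poolMaxConnections(fake)
	if err != nil {
		panic(err)
	}
	fmt.Println(v)
}
```

Keeping "setting missing" as a distinct error is the design point: it avoids the failure mode this PR removes, where an empty gauge could be misread as "no connections".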
@benben merged commit d5c5fad into main on May 21, 2026
22 checks passed
@benben deleted the ben/ducklake-metadata-observability-cleanup branch on May 21, 2026 11:53

1 participant