Skip to content

feat(metrics): pre-seed known channels at zero on startup#165

Merged
rubenhensen merged 2 commits into
mainfrom
feat/preseed-known-channels
May 18, 2026
Merged

feat(metrics): pre-seed known channels at zero on startup#165
rubenhensen merged 2 commits into
mainfrom
feat/preseed-known-channels

Conversation

@rubenhensen
Copy link
Copy Markdown
Contributor

Summary

  • Pre-seeds the six known channels (website, staging-website, outlook, thunderbird, api, unknown) to 0 for both cryptify_uploads_total and cryptify_upload_bytes_total in Metrics::new().
  • Renames the corresponding test from render_emits_zero_counters_when_empty to render_preseeds_known_channels_at_zero and iterates over KNOWN_CHANNELS so the contract stays self-validating if the list grows.

Why

Without pre-seeding, Prometheus only registers a series for {channel="X"} the first time an upload from that channel finalizes. The very first sample is value 1, and PromQL increase(...[1h]) over a window where the series has only post-event samples returns nothing — dashboards then read 0 for a channel that actually saw traffic until ~an hour after the first upload. The effect compounds on every cryptify pod restart (each rollout blanks the in-memory counters).

With this change all six channels emit a 0 line on the first scrape, so:

  • increase(...) has a 0-valued baseline to differentiate against.
  • Per-channel dashboard panels show every channel from the start, not just the ones that happen to have had traffic since the last rollout.
  • The if uploads.is_empty() fallback in render() becomes effectively dead, but is left in place defensively.

Test plan

  • cargo test metrics — 13/13 pass, including the rewritten preseed test that iterates over KNOWN_CHANNELS.
  • After deploy: scrape /metrics and confirm all six channel= rows are present at 0 before any uploads have happened.
  • In Cockpit Grafana Explore: cryptify_uploads_total returns six rows (one per channel) immediately on the first scrape post-rollout.

Without this, Prometheus only creates a series for cryptify_uploads_total
{channel="X"} when the first upload from channel X lands. PromQL
increase(...[1h]) over that series returns nothing because the earliest
sample is already non-zero — dashboards under-report low-volume channels
for an hour after each pod restart.

Pre-seeds website, staging-website, outlook, thunderbird, api, unknown
to 0 in Metrics::new(), so the full label set is visible from the first
scrape and increase() can compute deltas honestly.

The previous render-time "if empty, emit unknown=0" fallback is now
effectively dead (the map is never empty after new()), but left in place
defensively against any future code path that constructs Metrics via
Default rather than new().
@rubenhensen
Copy link
Copy Markdown
Contributor Author

/dobby review

Copy link
Copy Markdown
Contributor

@dobby-coder dobby-coder Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Pre-seeding the six known channels at 0 in Metrics::new() is the right fix for the PromQL increase() baseline problem you described — the first scrape after a pod restart now exposes the full label set, so per-channel dashboard panels stay populated from t=0 instead of appearing only after the first matching upload.

Verified locally:

  • cargo test metrics — 13/13 pass, including the rewritten render_preseeds_known_channels_at_zero which iterates over KNOWN_CHANNELS so the contract grows with the list.
  • The seeded entries flow through render() via the existing BTreeMap iteration; sort order is stable (api, outlook, staging-website, thunderbird, unknown, website), which matches what Prometheus expects.
  • record_upload's entry().or_insert(0) += 1 pattern composes correctly with the seeded zeros — pre-seeded channels increment from 0, novel channels still appear lazily.

Minor non-blocking observations (don't gate the merge):

  • Metrics::default() (auto-derived) still exists and now diverges from Metrics::new() in that it does not pre-seed. Every call site in the tree uses ::new(), so this is fine today, but a future contributor could trip on it. A short doc comment on Metrics::new() calling out "prefer this over default()" — or removing the #[derive(Default)] if it isn't needed elsewhere — would close the trap. Defer if you prefer.
  • As the PR description notes, the if uploads.is_empty() fallback in render() is now effectively unreachable. Leaving it as a belt-and-suspenders guard is reasonable; just flagging that it's drifted from being load-bearing.

Nothing in the rules or repo notes conflicts with this change. Ship it.

Address dobby's non-blocking observation on #165: with #[derive(Default)]
still in place, Metrics::default() produced an empty-channels object
while Metrics::new() pre-seeded — a footgun for a future contributor
who reaches for ::default() in a generic context or test helper.

Drops the derive, refactors new() to build the maps directly (no
internal lock dance), and adds a manual `impl Default for Metrics`
that just calls Self::new(). The two construction paths now produce
identical objects.

All 13 metrics tests still pass.
@rubenhensen rubenhensen merged commit 2af3ba0 into main May 18, 2026
7 checks passed
@rubenhensen rubenhensen deleted the feat/preseed-known-channels branch May 18, 2026 17:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant