feat(metrics): pre-seed known channels at zero on startup#165
Merged
Conversation
Without this, Prometheus only creates a series for cryptify_uploads_total
{channel="X"} when the first upload from channel X lands. PromQL
increase(...[1h]) over that series returns nothing because the earliest
sample is already non-zero — dashboards under-report low-volume channels
for an hour after each pod restart.
Pre-seeds website, staging-website, outlook, thunderbird, api, unknown
to 0 in Metrics::new(), so the full label set is visible from the first
scrape and increase() can compute deltas honestly.
The previous render-time "if empty, emit unknown=0" fallback is now
effectively dead (the map is never empty after new()), but left in place
defensively against any future code path that constructs Metrics via
Default rather than new().
Contributor
Author
|
/dobby review |
Contributor
There was a problem hiding this comment.
Looks good. Pre-seeding the six known channels at 0 in Metrics::new() is the right fix for the PromQL increase() baseline problem you described — the first scrape after a pod restart now exposes the full label set, so per-channel dashboard panels stay populated from t=0 instead of appearing only after the first matching upload.
Verified locally:
cargo test metrics— 13/13 pass, including the rewrittenrender_preseeds_known_channels_at_zerowhich iterates overKNOWN_CHANNELSso the contract grows with the list.- The seeded entries flow through
render()via the existing BTreeMap iteration; sort order is stable (api,outlook,staging-website,thunderbird,unknown,website), which matches what Prometheus expects. record_upload'sentry().or_insert(0) += 1pattern composes correctly with the seeded zeros — pre-seeded channels increment from 0, novel channels still appear lazily.
Minor non-blocking observations (don't gate the merge):
Metrics::default()(auto-derived) still exists and now diverges fromMetrics::new()in that it does not pre-seed. Every call site in the tree uses::new(), so this is fine today, but a future contributor could trip on it. A short doc comment onMetrics::new()calling out "prefer this overdefault()" — or removing the#[derive(Default)]if it isn't needed elsewhere — would close the trap. Defer if you prefer.- As the PR description notes, the
if uploads.is_empty()fallback inrender()is now effectively unreachable. Leaving it as a belt-and-suspenders guard is reasonable; just flagging that it's drifted from being load-bearing.
Nothing in the rules or repo notes conflicts with this change. Ship it.
Address dobby's non-blocking observation on #165: with #[derive(Default)] still in place, Metrics::default() produced an empty-channels object while Metrics::new() pre-seeded — a footgun for a future contributor who reaches for ::default() in a generic context or test helper. Drops the derive, refactors new() to build the maps directly (no internal lock dance), and adds a manual `impl Default for Metrics` that just calls Self::new(). The two construction paths now produce identical objects. All 13 metrics tests still pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
website,staging-website,outlook,thunderbird,api,unknown) to0for bothcryptify_uploads_totalandcryptify_upload_bytes_totalinMetrics::new().render_emits_zero_counters_when_emptytorender_preseeds_known_channels_at_zeroand iterates overKNOWN_CHANNELSso the contract stays self-validating if the list grows.Why
Without pre-seeding, Prometheus only registers a series for
{channel="X"}the first time an upload from that channel finalizes. The very first sample is value1, and PromQLincrease(...[1h])over a window where the series has only post-event samples returns nothing — dashboards then read0for a channel that actually saw traffic until ~an hour after the first upload. The effect compounds on every cryptify pod restart (each rollout blanks the in-memory counters).With this change all six channels emit a
0line on the first scrape, so:increase(...)has a0-valued baseline to differentiate against.if uploads.is_empty()fallback inrender()becomes effectively dead, but is left in place defensively.Test plan
cargo test metrics— 13/13 pass, including the rewritten preseed test that iterates overKNOWN_CHANNELS./metricsand confirm all sixchannel=rows are present at0before any uploads have happened.cryptify_uploads_totalreturns six rows (one per channel) immediately on the first scrape post-rollout.