feat(payload): improve BoundedProbability ergonomics for adoption#1876
feat(payload): improve BoundedProbability ergonomics for adoption#1876goxberry wants to merge 8 commits into
Conversation
Hand-write the comparison and hashing traits, sound under `try_new`'s invariants (no NaN, no `-0.0`). `PartialEq`/`PartialOrd`/`Ord` use `f32` native operators (`==`, `total_cmp`); `Hash` uses the `u32` bit pattern because `f32: !Hash`. The `Eq`/`Hash` contract holds because numerically equal valid values share a bit pattern. Add unit and property tests covering equality, ordering, hashing in a `HashSet`, and agreement between `Ord::cmp` and `f32::partial_cmp` over the full valid range. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rename the generic struct to `BoundedProbability<MIN_AS_BITS>` and expose two type aliases for the bounds that occur in lading payload config: `Probability` for $[0.0, 1.0]$ and `AtLeastOneTenth` for $[0.1, 1.0]$. Callers adopting the type no longer have to spell `f32::to_bits(...)` at each use site, which is the main blocker for wiring the type into existing `f32` fields (`dogstatsd::Config`, `opentelemetry::trace::Config`, etc.). ProbabilityError keeps its name since Probability is now the alias users typically construct. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add an arbitrary::Arbitrary impl gated on the existing `arbitrary` feature so types containing a Probability field can continue to derive Arbitrary once the type is wired into Config structs. A derived impl would call f32::arbitrary and emit NaN, infinity, and out-of-range values, so a manual impl is required. The impl samples a u32 in [MIN_AS_BITS, f32::to_bits(+1.0)] and decodes it with f32::from_bits. The f32 <-> u32 bit-ordering monotonicity that is already documented on the type guarantees the decoded value lies in [MIN, +1.0] and cannot be -0.0 (whose bit pattern 0x8000_0000 is well above 0x3f80_0000). Routing the result through try_new fires the per-monomorphization const-eval bound check and forwards any future invariant added to the constructor; the expect is sound by the argument above. Add three property tests, one per existing bound, gated on the same feature. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Probabilities in lading payload code exist to drive Bernoulli trials
("flip a coin biased by p"). Without a helper, every adopting call site
would extract the inner f32 via get() and pass it to rng.random_bool(p
as f64) by hand, which both leaks the f32 representation and recomputes
the same f32 -> f64 conversion at each use.
Add sample_bernoulli(&self, &mut rng) -> bool that delegates to
rng.random_bool. The f32 -> f64 conversion is exact for every f32 in
[+0.0, +1.0], so the success probability is preserved bit-for-bit, and
try_new's range invariant means random_bool never panics on
out-of-range input.
Add two property tests over u64 seeds, one for p = 0.0 and one for
p = 1.0. Together they pin the wiring (P(true) is not inverted) and
exercise the boundary values that random_bool is most likely to special-
case. A statistical test on intermediate p would only re-test
rand::Rng::random_bool itself, so it is omitted.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Introduces `AtLeastOneHundredth = BoundedProbability<{ f32::to_bits(0.01) }>`
for fields such as `unique_tag_ratio` whose existing 0.01 floor must admit
in-the-wild values below 0.1. Also updates the `AtLeastOneTenth` rustdoc to
drop the `unique_tag_ratio` reference, since that field will be wired to
`AtLeastOneHundredth` rather than tightened to a 0.1 floor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 21da2305c2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| /// Sample a Bernoulli trial with success probability `self.get()`. | ||
| /// | ||
| /// Returns `true` with probability `self.get()` and `false` otherwise. | ||
| /// The f32 ↔ f64 conversion is exact for every f32 in `[+0.0, +1.0]`, so |
There was a problem hiding this comment.
Replace non-ASCII documentation characters
AGENTS.md requires code and documentation to be US-ASCII only, but this added doc comment introduces a Unicode arrow; the same change adds another arrow below and an em dash in a test comment. Please replace the new non-ASCII characters with ASCII equivalents such as <-> and -- so the file continues to satisfy the documented repository standard.
Useful? React with 👍 / 👎.
AGENTS.md requires US-ASCII only in code and documentation. Replace two left-right arrows with `<->` and an em dash with `--`.
BoundedProbability rollout planWire the Decisions locked in
Phase 0 — Land the existing branchOn
Then Phase 1 — Stacked PRs (one per field)Per-field recipe
Hot-path call sites that stay
|
| PR | Field | Alias | Notes |
|---|---|---|---|
| 1 | TimestampConfig::probability |
Probability |
Self-contained struct with its own valid(). Many test fixtures in dogstatsd.rs construct TimestampConfig literally. |
| 2 | ValueConf::float_probability |
Probability |
Threads into NumValueGenerator. |
| 3 | Config::sampling_probability |
Probability |
Threads through MemberGenerator::new -> MetricGenerator::new. |
| 4 | Config::multivalue_pack_probability |
Probability |
Same path as PR 3. |
| 5 | Config::unique_tag_ratio |
AtLeastOneHundredth |
Threads through MemberGenerator::new -> common::tags::Generator::new (also dogstatsd::common::tags::Generator::new wrapper and opentelemetry::common::TagGenerator::new). The opentelemetry::common::UNIQUE_TAG_RATIO: f32 = 0.75 const becomes a const AtLeastOneHundredth via match try_new(0.75). Highest blast radius — tag tests reference the existing const bounds. |
Efficiency verification
After each Phase 1 PR, run the relevant criterion benches and compare against
the prior commit:
- PRs 1–4:
cargo bench -p lading_payload --bench dogstatsd(andblock). - PR 5: also OTel benches (
opentelemetry_log,opentelemetry_metric,
opentelemetry_traces).
Expected delta is noise — no allocation change, no new branch, no extra
function call (the .get() inlines away).
Out of scope
- No change to YAML/JSON wire format.
BoundedProbabilitycontinues to
serialize as a baref32viaserde(into = "f32", try_from = "f32"). - No swap of
OpenClosed01 < pforsample_bernoullianywhere. - No change to the existing
common::tags::MIN_UNIQUE_TAG_RATIO,
WARN_UNIQUE_TAG_RATIO, orMAX_UNIQUE_TAG_RATIOconstants — kept for
backward compatibility and reuse in tests.
| /// Generate a uniformly-distributed-over-bit-patterns value in `[MIN, +1.0]` | ||
| /// by sampling a `u32` in `[MIN_AS_BITS, f32::to_bits(+1.0)]` and decoding it. | ||
| /// | ||
| /// This works because the f32 <-> u32 ordering (documented on the type) is | ||
| /// monotonic for non-negative finite values, so every bit pattern in that | ||
| /// range decodes to a valid stored value. `-0.0`'s bit pattern is | ||
| /// `0x8000_0000`, far above `f32::to_bits(+1.0) = 0x3f80_0000`, so it can | ||
| /// never be generated. | ||
| #[cfg(feature = "arbitrary")] | ||
| impl<'a, const MIN_AS_BITS: u32> arbitrary::Arbitrary<'a> for BoundedProbability<MIN_AS_BITS> { | ||
| fn arbitrary(u: &mut arbitrary::Unstructured<'a>) -> arbitrary::Result<Self> { | ||
| let bits = u.int_in_range(MIN_AS_BITS..=f32::to_bits(Self::MAX))?; | ||
| let value = f32::from_bits(bits); | ||
| // Routing through `try_new` fires the per-monomorphization const-eval | ||
| // bound check on `Self::MIN` and forwards any future invariant added | ||
| // to the constructor. The `expect` is safe by the argument above. | ||
| Ok(Self::try_new(value).expect("bits in [MIN_AS_BITS, MAX_AS_BITS] always valid")) | ||
| } |
There was a problem hiding this comment.
I don't think this trait implementation would actually get used unless we were fuzzing this type for some reason. I can't think of a scenario where we would actually do that. I think this block of code could be deleted without any real consequence.
Drop `impl Display`, `impl Eq`, `impl Ord`, `impl PartialOrd`, `impl Hash`, `sample_bernoulli`, and the `AtLeastOneTenth` alias from `BoundedProbability`. None of these are reached by production code at the tip of the adoption stack: callers only need `PartialEq` (transitively via derives on `dogstatsd::Config`, `ValueConf`, `TimestampConfig`), `Arbitrary` (transitively via fuzz target derives), `TryFrom`/`From` (required by serde), `try_new`/`get`, and the `Probability` and `AtLeastOneHundredth` aliases. Drop the now-dead tests covering the removed items. `cargo test -p lading-payload` passes 239 tests; clippy and the `arbitrary` feature build clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Remove unused
|
`cargo fmt --check` flagged a stray blank line before the closing brace of `mod probability_tests` introduced by the prior cleanup commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

What does this PR do?
Prepares
BoundedProbability(added in #1874) to replace baref32probability fields throughout
lading_payloadconfigs. The PR is asequence of independently-reviewable commits that add the surface the
adoption stack actually needs, plus a final cleanup commit that drops
the items the adoption PRs turned out not to use:
impl PartialEq/Eq/PartialOrd/Ord/Hash— hand-written, sound undertry_new's invariants (no NaN, no-0.0).PartialEq/Ordusef32operators (
==,total_cmp);Hashuses theu32bit pattern becausef32: !Hash. TheEq/Hashcontract holds because numerically equalvalid values share a bit pattern.
BoundedProbability<MIN_AS_BITS>and addProbability([0.0, 1.0])and
AtLeastOneTenth([0.1, 1.0]) so callers don't have to spellf32::to_bits(...)at each use site.impl arbitrary::Arbitrary(feature-gated) — needed so structscontaining a
Probabilityfield can keep derivingArbitraryoncewired in. A derived impl would emit NaN / infinity / out-of-range
values; this one samples a
u32in[MIN_AS_BITS, to_bits(+1.0)]andrelies on the documented
f32<->u32monotonicity to guaranteevalidity.
sample_bernoullihelper — replaces therng.random_bool(p.get() as f64)pattern that every adopting call sitewould otherwise repeat. The
f32->f64conversion is exact on[0.0, 1.0], so the success probability is preserved bit-for-bit.AtLeastOneHundredthalias —BoundedProbability<{ f32::to_bits(0.01) }>for
unique_tag_ratio's existing 0.01 floor, which must admitin-the-wild values below 0.1. Also drops the
unique_tag_ratioreference from the
AtLeastOneTenthrustdoc.not exercise:
impl Display(originally from payload config: add new Probability<const u32: MIN_AS_BITS> type #1874),impl Eq,impl Ord,impl PartialOrd,impl Hash,sample_bernoulli, andthe
AtLeastOneTenthalias. Surviving surface:PartialEq(transitively required by
dogstatsd::Config/ValueConf/TimestampConfigderiving
PartialEq),Arbitrary(transitively required by fuzztargets that derive
ArbitraryonConfig),TryFrom<f32>/From<BoundedProbability<_>> for f32(required byserde(into = "f32", try_from = "f32")),try_new,get, and theProbabilityandAtLeastOneHundredthaliases. The dead tests arepruned alongside.
Each non-cleanup commit ships with unit and property tests; the cleanup
commit removes the now-dead tests covering removed items.
Motivation
BoundedProbabilityexists but isn't yet usable inConfigstructs:adopters need
PartialEqto keep deriving that trait,Arbitraryforfuzz harnesses, and named aliases instead of bit-pattern incantations.
Commits 1–5 build out a wider surface area than needed; commit 6 trims
back to what the rest of the stack actually calls. The history is
preserved so the rationale for each piece is reviewable, with the
cleanup commit acting as a journaled "what stuck" record.
Related issues
Follow-up to #1874. Foundation for the adoption stack:
TimestampConfig::probabilityValueConf::float_probabilityConfig::sampling_probabilityConfig::multivalue_pack_probabilityConfig::unique_tag_ratio(AtLeastOneHundredth)Additional Notes
No wire-format change —
BoundedProbabilitystill serializes as a baref32viaserde(into = "f32", try_from = "f32"). Phase 1 PRs in thestack switch one
Configfield per PR with no behavioral change atruntime (
.get()inlines to the same machine code as directf32access, and no sampler is swapped).