Skip to content

feat(payload): adopt AtLeastOneHundredth for Config::unique_tag_ratio#1881

Draft
goxberry wants to merge 1 commit into
goxberry/probability-multivalue-pack-probfrom
goxberry/probability-unique-tag-ratio
Draft

feat(payload): adopt AtLeastOneHundredth for Config::unique_tag_ratio#1881
goxberry wants to merge 1 commit into
goxberry/probability-multivalue-pack-probfrom
goxberry/probability-unique-tag-ratio

Conversation

@goxberry
Copy link
Copy Markdown
Contributor

@goxberry goxberry commented May 21, 2026

What does this PR do?

Change the public dogstatsd::Config::unique_tag_ratio field from f32 to
the AtLeastOneHundredth alias of BoundedProbability<{ f32::to_bits(0.01) }>.
The try_from impl enforces the finite + [0.01, 1.0] invariant at
deserialize time, so the redundant MIN_UNIQUE_TAG_RATIO..=MAX_UNIQUE_TAG_RATIO
range check in common::tags::Generator::new is removed. The WARN-level
check for values in [MIN, WARN_UNIQUE_TAG_RATIO] is preserved.

The new type threads through MemberGenerator::new,
common::tags::Generator::new, dogstatsd::common::tags::Generator::new,
and opentelemetry::common::TagGenerator::new. The OTel UNIQUE_TAG_RATIO
constant becomes a const AtLeastOneHundredth via a match on try_new,
with unreachable!() in the Err arm (the bound is provably valid at
const-eval time). MAX_UNIQUE_TAG_RATIO, no longer referenced outside
tests, is now #[cfg(test)].

At the comparison site in <common::tags::Generator as Generator>::generate,
.get() extracts the inner f32 so RNG sequences and bit-exact output are
preserved.

Motivation

Fifth and final per-field PR in Phase 1 of the BoundedProbability rollout.
Pushes range validation to deserialize time once and removes the runtime
range check from the tag-generator constructor. AtLeastOneHundredth is
chosen over Probability to preserve the existing 0.01 floor and over
AtLeastOneTenth so that in-the-wild configs in [0.01, 0.10) keep
deserializing.

Verification

cargo test -p lading-payload: 250 passed, 0 failed.

cargo clippy -p lading-payload --all-targets -- -D warnings: clean.

Criterion comparison vs. parent (goxberry/probability-multivalue-pack-prob)
for cargo bench -p lading-payload --bench dogstatsd:

Bench Median Δ Criterion verdict
dogstatsd_setup +1.59% within noise
dogstatsd_throughput/1MiB +2.27% flagged regressed
dogstatsd_throughput/10MiB +0.83% no change
dogstatsd_throughput/100MiB +1.17% within noise
dogstatsd_throughput/1GiB -0.88% no change

.get() is const fn returning a copied f32, so the IR at the comparison
site is bit-identical. The 1 MiB regression is most plausibly machine-state
noise (the bench was run right after a CPU-heavy OTel job had warmed the
system); the larger sizes do not reproduce it. OTel benches were skipped
because the OTel hot path is the same common::tags::Generator::generate
already exercised by dogstatsd, and the OTel call sites only pass
UNIQUE_TAG_RATIO to TagGenerator::new (init time, not hot path).

Related issues

Phase 1 of .claude/probability-type-rollout-plan.md, building on:

Additional Notes

Wire format is unchanged: BoundedProbability continues to round-trip as a
bare f32 via serde(into = "f32", try_from = "f32"). The
MIN_UNIQUE_TAG_RATIO and WARN_UNIQUE_TAG_RATIO constants are retained
because the WARN-level check still references them.

Copy link
Copy Markdown
Contributor Author

@datadog-prod-us1-3
Copy link
Copy Markdown

datadog-prod-us1-3 Bot commented May 21, 2026

Pipelines

Fix all issues with BitsAI

⚠️ Warnings

🚦 3 Pipeline jobs failed

Continuous integration | Rust Actions (Check/Fmt/Clippy) (macos-latest, fmt)   View in Datadog   GitHub Actions

🔧 Fix in code (Fix with Cursor). Compilation error in dogstatsd.rs:278: expected one statement but found multiple.

Continuous integration | Rust Actions (Check/Fmt/Clippy) (ubuntu-latest, fmt)   View in Datadog   GitHub Actions

🔧 Fix in code (Fix with Cursor). Formatting errors detected in dogstatsd.rs and common.rs. Please fix the formatting differences.

Changelog Check | changelog-check   View in Datadog   GitHub Actions

🛟 This job is unlikely to succeed on retry. Please review your pipeline configuration. No changes to CHANGELOG.md detected. Add 'no-changelog' label if this is intentional.

Useful? React with 👍 / 👎

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 4e9ba85 | Docs | Datadog PR Page | Give us feedback!

@goxberry
Copy link
Copy Markdown
Contributor Author

PR 5: Adopt AtLeastOneHundredth for Config::unique_tag_ratio

Context

This is the fifth and final per-field PR in Phase 1 of
.claude/probability-type-rollout-plan.md. PRs 1–4 already wired
Probability into TimestampConfig::probability, ValueConf::float_probability,
Config::sampling_probability, and Config::multivalue_pack_probability.

The remaining f32 probability/ratio in lading_payload is
dogstatsd::Config::unique_tag_ratio. This PR changes its type from f32 to
AtLeastOneHundredth (= BoundedProbability<{ f32::to_bits(0.01) }>) and
threads the typed value down through the tag generator constructors so that
range validation happens once at deserialize time rather than at
Generator::new time.

AtLeastOneHundredth is used instead of Probability because the existing
runtime check enforces a floor of 0.01 (MIN_UNIQUE_TAG_RATIO), and using
AtLeastOneTenth would tighten the bound and reject in-the-wild configs in
[0.01, 0.10). The decision is locked in .claude/probability-type-rollout-plan.md
("Decisions locked in").

Branch: goxberry/probability-unique-tag-ratio, stacked on top of
goxberry/probability-multivalue-pack-prob via gt.

Assumptions

  • AtLeastOneHundredth is already defined in lading_payload/src/common/config.rs:184.
  • BoundedProbability::try_new is const fn, so the OTel
    UNIQUE_TAG_RATIO: f32 = 0.75 const can become const AtLeastOneHundredth
    via match AtLeastOneHundredth::try_new(0.75).
  • The wire format is unchanged: BoundedProbability round-trips as a bare
    f32 via serde(into = "f32", try_from = "f32").
  • The MIN_UNIQUE_TAG_RATIO, WARN_UNIQUE_TAG_RATIO, and
    MAX_UNIQUE_TAG_RATIO constants stay (used by tests and the warn-check;
    the rollout plan's "Out of scope" section pins this).
  • Hot-path bit-exactness is preserved by calling .get() at the comparison
    site, exactly as PRs 1–4 did.
  • Test code currently passes raw f32 values to Generator::new; the
    minimal change is to wrap each call with AtLeastOneHundredth::try_new(x).expect(...).

Files to modify

  1. lading_payload/src/dogstatsd.rs

    • Import: add AtLeastOneHundredth next to Probability in the
      crate::common::config use list.
    • Config::unique_tag_ratio field: f32AtLeastOneHundredth. Update
      the rustdoc to drop the "between 0.10 and 1.0" claim (the type enforces
      [0.01, 1.0]).
    • Default for Config: replace unique_tag_ratio: 0.11 with
      unique_tag_ratio: AtLeastOneHundredth::try_new(0.11).expect("0.11 is in [0.01, 1.0]").
    • MemberGenerator::new: change the unique_tag_ratio: f32 parameter to
      AtLeastOneHundredth. The value is forwarded unchanged to
      tags::Generator::new.
  2. lading_payload/src/common/tags.rs

    • Import: use crate::common::config::{AtLeastOneHundredth, ConfRange};.
    • Generator::unique_tag_probability field: f32AtLeastOneHundredth.
    • Generator::new signature: parameter unique_tag_probability: f32
      AtLeastOneHundredth.
    • Remove the now-redundant MIN_UNIQUE_TAG_RATIO..=MAX_UNIQUE_TAG_RATIO
      range check (lines 191–195). The type enforces [0.01, 1.0] at
      deserialize time.
    • The WARN check (lines 197–201) keeps its current behavior. Call
      .get() on the parameter for the range membership test:
      (MIN_UNIQUE_TAG_RATIO..=WARN_UNIQUE_TAG_RATIO).contains(&unique_tag_probability.get()).
    • Hot path at line 277:
      let should_reuse = choose_existing_prob > self.unique_tag_probability.get();
    • Update rustdoc on Generator and Generator::new to drop stale
      "between 0.10 and 1.0" prose; the type encodes the lower bound.
    • Tests in this file pass raw f32 to Generator::new. Wrap the constant
      literal 1.0 calls (lines 334, 373) and the proptest-generated
      unique_tag_ratio (line 423) with AtLeastOneHundredth::try_new(x).expect(...).
  3. lading_payload/src/dogstatsd/common/tags.rs

    • Import: add AtLeastOneHundredth next to PoolKind / tags.
    • Generator::new signature: unique_tag_probability: f32
      AtLeastOneHundredth. Forward to inner tags::Generator::new unchanged.
    • Tests pass raw f32 to inner tags::Generator::new. Wrap the literal
      1.0 calls (lines 181, 227, 266, 351) and the proptest-generated
      unique_tag_ratio (lines 306, 400) with
      AtLeastOneHundredth::try_new(x).expect(...).
  4. lading_payload/src/opentelemetry/common.rs

    • Import: use crate::common::config::{AtLeastOneHundredth, ConfRange};.
    • UNIQUE_TAG_RATIO constant: f32 = 0.75AtLeastOneHundredth = match AtLeastOneHundredth::try_new(0.75) { Ok(p) => p, Err(_) => panic!(...) }.
    • TagGenerator::new signature: unique_tag_probability: f32
      AtLeastOneHundredth. Forward to inner tags::Generator::new unchanged.
  5. lading_payload/src/opentelemetry/log/templates.rs and
    lading_payload/src/opentelemetry/metric/templates.rs

    • No source changes — call sites already pass UNIQUE_TAG_RATIO, whose
      type is updated centrally.

Hot-path call site (one site for this PR)

Per the rollout plan's table, the only hot-path comparison for this field is:

  • lading_payload/src/common/tags.rs:277 — change
    choose_existing_prob > self.unique_tag_probability to
    choose_existing_prob > self.unique_tag_probability.get().

.get() is const fn returning a copied f32, so the generated code is
identical and RNG sequences remain bit-exact.

Out of scope (explicitly)

  • No YAML/JSON wire-format change.
  • No substitution of OpenClosed01 < p with sample_bernoulli.
  • No removal of MIN_UNIQUE_TAG_RATIO, WARN_UNIQUE_TAG_RATIO, or
    MAX_UNIQUE_TAG_RATIO.
  • No reordering or "tidy-up" of adjacent code.

Verification

  1. Type check + lints: ci/scripts/check-features.sh or
    cargo check -p lading_payload --all-features and
    cargo clippy -p lading_payload --all-features -- -D warnings.
  2. Unit + proptest suite for the affected crate:
    cargo test -p lading_payload. This re-runs the proptests in
    common::tags and dogstatsd::common::tags that vary
    unique_tag_ratio over [WARN, MAX), which exercises the
    wrap-into-AtLeastOneHundredth call paths in tests.
  3. Bit-exact byte output for default config: confirm by running
    cargo bench -p lading_payload --bench dogstatsd once before and once
    after the change. Expected delta is noise — no allocations added, no
    new branches, .get() inlines away. Capture the criterion summary in
    the PR description.
  4. OTel benches per the rollout plan's "Efficiency verification" for PR 5:
    cargo bench -p lading_payload --bench opentelemetry_log,
    --bench opentelemetry_metric, --bench opentelemetry_traces. Same
    expectation: noise-level delta.
  5. Deserialize-time rejection: a runtime sanity check via
    serde_yaml::from_str::<Config>("...unique_tag_ratio: 0.005\n...")
    should now fail with a "below lower bound 0.01" error rather than
    succeeding and then failing later in Generator::new. (Optionally
    captured as a new unit test; matches the pattern already added in PR 1
    for timestamp.probability at dogstatsd.rs:1003.)

Commit / submit

  • Commit message follows the established pattern of PRs 1–4:
    feat(payload): adopt AtLeastOneHundredth for Config::unique_tag_ratio
    with a short body explaining the type change, the threaded constructors,
    the removed redundant range check, and the bit-exact .get() hot-path
    preservation. Include the Co-Authored-By trailer.
  • gt create --all to branch from goxberry/probability-multivalue-pack-prob
    then gt submit to push and open the stacked PR (matches the existing
    Graphite stack already initialized with main as trunk).

@goxberry goxberry force-pushed the goxberry/probability-unique-tag-ratio branch from 177b73a to af42547 Compare May 21, 2026 05:44
@goxberry goxberry force-pushed the goxberry/probability-multivalue-pack-prob branch 2 times, most recently from 52e6ac8 to 24aaa41 Compare May 21, 2026 06:34
@goxberry goxberry force-pushed the goxberry/probability-unique-tag-ratio branch from af42547 to 123b9cc Compare May 21, 2026 06:34
Change the public `Config::unique_tag_ratio` field from `f32` to the
`AtLeastOneHundredth` alias of `BoundedProbability<{ f32::to_bits(0.01) }>`.
The `try_from` impl enforces the finite + `[0.01, 1.0]` invariant at
deserialize time, so the redundant `MIN_UNIQUE_TAG_RATIO..=MAX_UNIQUE_TAG_RATIO`
range check in `common::tags::Generator::new` is removed. The WARN-level
check for values in `[MIN, WARN_UNIQUE_TAG_RATIO]` is preserved.

The new type threads through `MemberGenerator::new`,
`common::tags::Generator::new`, `dogstatsd::common::tags::Generator::new`,
and `opentelemetry::common::TagGenerator::new`. The OTel `UNIQUE_TAG_RATIO`
constant becomes a `const AtLeastOneHundredth` via a `match` on `try_new`.
`MAX_UNIQUE_TAG_RATIO`, no longer referenced outside tests, is now
`#[cfg(test)]`.

At the comparison site in `<common::tags::Generator as Generator>::generate`,
`.get()` extracts the inner `f32` so RNG sequences and bit-exact output are
preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@goxberry goxberry force-pushed the goxberry/probability-multivalue-pack-prob branch from 24aaa41 to ce4d6c2 Compare May 21, 2026 06:47
@goxberry goxberry force-pushed the goxberry/probability-unique-tag-ratio branch from 123b9cc to 4e9ba85 Compare May 21, 2026 06:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant