Add t-digest 3.2 vs 3.3 comparison and merge-order reproducer #18166

Open
xiangfu0 wants to merge 1 commit into apache:claude/crazy-wilbur from xiangfu0:tdigest-reproducer-only

xiangfu0 (Contributor) commented Apr 10, 2026

Summary

  • add a direct t-digest 3.2 vs 3.3 comparison test on a deterministic Pinot-like hierarchical merge dataset
  • keep the pure 3.3 merge-order reproducer that shows compression 500 restoring stable hierarchical merges
  • copy com.tdunning:t-digest:3.2 into the test target so both versions can be exercised side-by-side in the same Surefire JVM
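
One common way to exercise two versions of a library in a single JVM is to load the older jar through its own class loader, so its classes do not collide with the version on the test classpath. The PR text does not say how the copied jar is loaded, so the following is only a minimal sketch under that assumption; `IsolatedJarLoader`, `loadIsolated`, and any jar path passed to it are hypothetical names for illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;

// Hypothetical helper, not taken from the PR.
final class IsolatedJarLoader {
  private IsolatedJarLoader() {
  }

  /**
   * Loads className from the given jar using a class loader whose parent is
   * the platform loader. Application-classpath classes (for example a newer
   * copy of the same library) are therefore not visible through this loader.
   * The loader is intentionally left open because the returned class needs it.
   */
  static Class<?> loadIsolated(Path jar, String className) {
    try {
      URL[] urls = {jar.toUri().toURL()};
      URLClassLoader loader =
          new URLClassLoader(urls, ClassLoader.getPlatformClassLoader());
      return Class.forName(className, true, loader);
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException("Bad jar path: " + jar, e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException("Class not found in " + jar + ": " + className, e);
    }
  }
}
```

With a real t-digest 3.2 jar, the class to request would be `com.tdunning.math.stats.TDigest`; because that class is distinct from the 3.3 class of the same name on the classpath, its methods would have to be invoked reflectively.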

Validation

  • ./mvnw -pl pinot-segment-local -Dtest=TDigestVersionComparisonTest,TDigestMergeOrderReproducerTest -Dsurefire.failIfNoSpecifiedTests=false test
  • same command passed in 10 fresh Surefire JVM runs (10/10)
  • ./mvnw -pl pinot-segment-local spotless:apply
  • ./mvnw -pl pinot-segment-local checkstyle:check
  • ./mvnw -pl pinot-segment-local license:format
  • ./mvnw -pl pinot-segment-local license:check

Comparison Signal

On the minimized exact-quantile comparison scenario, the deterministic dataset shows a clear gap:

  • 3.2 @ compression 100: about 0.000074 max normalized error with 121 centroids
  • 3.3 @ compression 100: about 0.005094 max normalized error with 55 centroids
  • 3.3 @ compression 150: about 0.000049 max normalized error with 79 centroids
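
For readers who want to reproduce a figure of this kind, the sketch below shows one plausible way to compute a "max normalized error" against exact quantiles. The PR does not define the metric, so the definition used here (absolute error divided by the data range, maximized over a set of probed quantiles) is an assumption, as are the names `QuantileErrorSketch`, `exactQuantile`, and `maxNormalizedError`.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Hypothetical helper, not taken from the PR.
final class QuantileErrorSketch {
  private QuantileErrorSketch() {
  }

  /** Exact quantile of an ascending-sorted array, using linear interpolation. */
  static double exactQuantile(double[] sorted, double q) {
    if (sorted.length == 0) {
      throw new IllegalArgumentException("empty input");
    }
    double pos = q * (sorted.length - 1);
    int lo = (int) Math.floor(pos);
    int hi = Math.min(lo + 1, sorted.length - 1);
    return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
  }

  /**
   * Assumed metric: max over the probed quantiles of
   * |estimate(q) - exact(q)| / (max - min).
   */
  static double maxNormalizedError(double[] values, double[] probes,
      DoubleUnaryOperator estimator) {
    double[] sorted = values.clone();
    Arrays.sort(sorted);
    double range = sorted[sorted.length - 1] - sorted[0];
    if (range == 0) {
      throw new IllegalArgumentException("all values are equal");
    }
    double worst = 0;
    for (double q : probes) {
      double err = Math.abs(estimator.applyAsDouble(q) - exactQuantile(sorted, q)) / range;
      worst = Math.max(worst, err);
    }
    return worst;
  }
}
```

With the t-digest library on the classpath, the estimator argument could be a method reference such as `digest::quantile` on a `TDigest` built from the same values.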

Context

This PR is intentionally split out from #18103 so the reproducer and the direct 3.2 vs 3.3 comparison can be reviewed independently from the production compression change.

xiangfu0 force-pushed the tdigest-reproducer-only branch from 36b7322 to 2da64c8 on April 11, 2026 00:48
xiangfu0 changed the title from "Add pure t-digest merge-order reproducer" to "Add t-digest 3.2 vs 3.3 comparison and merge-order reproducer" on Apr 11, 2026