@baogorek
Collaborator

Summary

  • Fixes weighted quantile/median to use inverse CDF method instead of interpolation
  • The previous implementation gave incorrect results (e.g., the median of values [0, 1M] with weights [99, 1] was 10,000 instead of 0)
  • Now matches R's survey::svyquantile(qrule="math"), the gold standard for weighted survey quantiles
  • Verified against R's survey package output
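
To illustrate the difference described above, here is a minimal sketch and not the code from this PR. It assumes the old behaviour was a midpoint-style interpolation (one common variant, and one that happens to reproduce the 10,000 figure) and contrasts it with an inverse CDF lookup. Both function names are hypothetical.

```python
import numpy as np


def weighted_quantile_interp(values, weights, q):
    """Midpoint interpolation (ASSUMED stand-in for the previous behaviour)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    # Place each observation at the midpoint of its share of total weight,
    # then interpolate linearly between those positions.
    positions = (np.cumsum(weights) - 0.5 * weights) / weights.sum()
    return float(np.interp(q, positions, values))


def weighted_quantile_inverse_cdf(values, weights, q):
    """Smallest value whose cumulative weight share is >= q (hypothetical name)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cdf = np.cumsum(weights) / weights.sum()
    # side="left" returns the first index i with cdf[i] >= q.
    idx = int(np.searchsorted(cdf, q, side="left"))
    # Clamp in case rounding leaves cdf[-1] slightly below 1.0.
    idx = min(idx, len(values) - 1)
    return float(values[idx])


values, weights = [0, 1_000_000], [99, 1]
print(weighted_quantile_interp(values, weights, 0.5))       # roughly 10,000
print(weighted_quantile_inverse_cdf(values, weights, 0.5))  # 0.0
```

With 99% of the weight sitting on 0, the inverse CDF lookup returns an observed value (0), while interpolation produces a number that appears nowhere in the data.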

Test plan

  • Added test for skewed distribution (99/1 weights)
  • Added test for boundary conditions (q=0, q=1)
  • Added test for equal weights
  • Added test for unsorted input
  • Verified all test cases against R's survey::svyquantile
  • All existing tests pass
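
The cases in the test plan could look roughly like the sketch below. This is not the PR's test suite: the helper `weighted_quantile` and the test names are hypothetical, and the expected values are what this sketch's inverse CDF rule returns, not output copied from R. The last check reflects the linked issue's goal that, for integer weights, the weighted result should agree with the same rule applied to stacked (repeated) data.

```python
import numpy as np


def weighted_quantile(values, weights, q):
    """Hypothetical helper: smallest value whose cumulative weight share is >= q."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cdf = np.cumsum(weights) / weights.sum()
    idx = min(int(np.searchsorted(cdf, q, side="left")), len(values) - 1)
    return float(values[idx])


def test_skewed_weights():
    # 99% of the weight is on 0, so the median is 0.
    assert weighted_quantile([0, 1_000_000], [99, 1], 0.5) == 0.0


def test_boundaries():
    # In this sketch q=0 maps to the minimum and q=1 to the maximum.
    assert weighted_quantile([5, 1, 3], [1, 2, 3], 0.0) == 1.0
    assert weighted_quantile([5, 1, 3], [1, 2, 3], 1.0) == 5.0


def test_equal_weights():
    # The inverse CDF rule returns an observed value, not the 2.5 midpoint.
    assert weighted_quantile([1, 2, 3, 4], [1, 1, 1, 1], 0.5) == 2.0


def test_unsorted_input():
    # Weights must travel with their values when sorting.
    assert weighted_quantile([1_000_000, 0], [1, 99], 0.5) == 0.0
    assert weighted_quantile([3, 1, 2], [1, 1, 1], 0.5) == 2.0


def test_matches_stacked_data():
    values, weights = [1, 2, 3], [2, 1, 3]
    stacked = np.repeat(values, weights)  # [1, 1, 2, 3, 3, 3]
    for q in (0.25, 0.5, 0.75):
        expected = weighted_quantile(stacked, np.ones(len(stacked)), q)
        assert weighted_quantile(values, weights, q) == expected
```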

Closes #50

🤖 Generated with Claude Code

The previous implementation used interpolation, which gave incorrect
results for weighted survey data. For example, with values [0, 1M] and
weights [99, 1], the median was 10,000 instead of 0.

Now uses the inverse CDF method (smallest value at which the cumulative
weight share is >= q), matching R's survey::svyquantile(qrule="math").
Verified against R's survey package output.

Closes #50

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@baogorek baogorek merged commit bb80b16 into master Dec 1, 2025
8 checks passed
@nwoodruff-co nwoodruff-co deleted the fix-weighted-quantile branch December 18, 2025 13:39
@nwoodruff-co nwoodruff-co restored the fix-weighted-quantile branch December 18, 2025 13:39
@nwoodruff-co nwoodruff-co deleted the fix-weighted-quantile branch December 18, 2025 13:39
Development

Successfully merging this pull request may close these issues.

Make weighted percentiles match unweighted percentiles of stacked data

3 participants