Describe the bug
The documentation for the Wilcoxon signed-rank test states:
"This implemented test uses a normal distribution approximation (validated against SciPy with mode='approx')."
(https://fslab.org/FSharp.Stats/Testing.html#Wilcoxon-Test)
However, we are seeing different test results compared to SciPy's implementation on certain sample sets, particularly when ties are involved.
To Reproduce
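Use a dataset whose ranks contain at least one tie group with more than two members. Such a group fails the first condition of the tie correction (j = 2.0) and falls into the else branch, where the correction term is multiplied by the rank i. The following F# snippet shows the tie correction calculation: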
let ties =
    ranks
    |> Array.countBy id
    |> Array.filter (fun (i, j) -> j > 1)
    |> Array.map (fun (i, j) -> float i, float j)
let tieCorrection (i, j) =
    if j = 2.0 then
        (j ** 3. - j) / 48.
    else
        i * ((j ** 3. - j) / 48.)
(snippet from https://github.com/fslaborg/FSharp.Stats/blob/developer/src/FSharp.Stats/Testing/Wilcoxon.fs)
Note: Multiplying by the rank i appears to be inconsistent with both SciPy's tie correction and the standard formulation in the statistical literature. Based on this, I believe the correction should not include the rank at all, and the first branch, (j ** 3. - j) / 48., should be applied to every tie group regardless of its size.
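For reference, this is the tie-corrected variance of the normal approximation as I understand the standard formulation. The symbols are my own notation, not taken from the FSharp.Stats source: $n$ is the number of non-zero differences, $T$ is the signed-rank statistic, and $t_k$ is the size of the $k$-th group of tied ranks.

$$
\operatorname{Var}(T) = \frac{n(n+1)(2n+1)}{24} - \sum_{k} \frac{t_k^{3} - t_k}{48},
\qquad
z = \frac{T - \dfrac{n(n+1)}{4}}{\sqrt{\operatorname{Var}(T)}}
$$

Each tie group contributes only through its size $t_k$. The rank value shared by the group does not appear in the correction.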
Expected behavior
The test results should match SciPy's Wilcoxon test using mode="approx" as stated in the documentation.
Screenshots
No visual interface used; testing was done programmatically.
Additional context
The issue seems to stem from the tie correction formula, where the implementation multiplies the correction by the rank i. This does not align with the standard correction formula used in SciPy and academic references.
Removing this multiplication produces results that match SciPy, suggesting a bug in the tie correction logic.
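The discrepancy can be illustrated without either library. The following is a minimal sketch in plain Python, not the FSharp.Stats or SciPy code: both function names and the example ranks are made up for illustration, and the second function only mirrors the per-group logic of the F# snippet quoted above.

```python
from collections import Counter


def tie_correction_standard(ranks):
    """Sum (t^3 - t) / 48 over all tie groups of size t > 1."""
    counts = Counter(ranks)
    return sum((t**3 - t) / 48 for t in counts.values() if t > 1)


def tie_correction_as_reported(ranks):
    """Mirror the quoted F# logic: groups of size other than 2
    have their term multiplied by the tied rank value."""
    counts = Counter(ranks)
    total = 0.0
    for rank, t in counts.items():
        if t > 1:
            term = (t**3 - t) / 48
            total += term if t == 2 else rank * term
    return total


# Hypothetical ranks of absolute differences: one tie group of
# size 3 (rank 3.0) and one tie group of size 2 (rank 5.5).
ranks = [1.0, 3.0, 3.0, 3.0, 5.5, 5.5]

standard = tie_correction_standard(ranks)
reported = tie_correction_as_reported(ranks)
print(standard, reported)
```

With these ranks the two functions disagree only because of the size-3 group. If every tie group has exactly two members, both return the same value, which would explain why the mismatch shows up only on certain sample sets.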
SciPy docs:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wilcoxon.html
SciPy implementation (starting at line 168):
https://github.com/scipy/scipy/blob/17603e519b3fe2cb3a94dcda99475f3100f23828/scipy/stats/_wilcoxon.py#L168