
FDistribution.inverseCumulativeProbability returns incorrect quantile (orders of magnitude error) for small degrees of freedom #458

@yin-mao

Description

The method FDistribution.inverseCumulativeProbability(p) returns a value that does not satisfy the expected definition of a quantile.

Specifically, the returned value is not the smallest x such that CDF(x) >= p, and is in fact several orders of magnitude larger than the correct solution.


Reproducible Example

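// Hipparchus package; for Commons Math 3.6.1 use org.apache.commons.math3.distribution.FDistribution
import org.hipparchus.distribution.continuous.FDistribution;

public class FDistributionQuantileRepro {
    public static void main(String[] args) {
        double numeratorDf = 0.10006;
        double denominatorDf = 1.51904;

        FDistribution dist = new FDistribution(numeratorDf, denominatorDf);
        double p = 0.16038;

        // Quantile as returned by the library
        double x = dist.inverseCumulativeProbability(p);

        System.out.println("x = " + x);
        System.out.println("CDF(x) = " + dist.cumulativeProbability(x));

        // CDF slightly below the returned x
        double x2 = x - 1e-9;
        System.out.println("CDF(x - 1e-9) = " + dist.cumulativeProbability(x2));

        // Scan [0, 1e-8) in steps of 1e-14 for a smaller x whose CDF already exceeds p
        int steps = 1000000;
        double max = 1e-8;

        for (int i = 0; i < steps; i++) {
            double testX = i * (max / steps);
            double cdf = dist.cumulativeProbability(testX);
            if (cdf > p) {
                System.out.println("First x where CDF > p: " + testX + ", CDF=" + cdf);
                break;
            }
        }
    }
}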


Observed Behavior

  • inverseCumulativeProbability(p) returns:
    x ≈ 1.35e-9

  • CDF(x) ≈ 0.3069, which is significantly larger than p = 0.16038

  • A much smaller value exists:
    x ≈ 3.13e-15
    CDF(x) ≈ 0.16038


Expected Behavior

The method should return a value x such that:

  • CDF(x) ≈ p
  • or at least the smallest x such that CDF(x) >= p
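
The expectation above could be checked with a small helper along the following lines. This is only a sketch: QuantileCheck, isPlausibleQuantile, and the stand-in toyCdf (F(x) = x^0.05 on [0, 1]) are hypothetical names introduced for illustration and are not part of Hipparchus. The exponent 0.05 is chosen because, near zero, the F CDF grows like a power of x with exponent numeratorDf / 2; the toy function is not the F distribution itself.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical helper, not part of Hipparchus or Commons Math.
final class QuantileCheck {

    // Stand-in CDF on [0, 1]: F(x) = x^0.05. An assumption for illustration only.
    static double toyCdf(double x) {
        if (x <= 0.0) {
            return 0.0;
        }
        if (x >= 1.0) {
            return 1.0;
        }
        return Math.pow(x, 0.05);
    }

    // True when CDF(x) is within tol of the requested probability p.
    static boolean isPlausibleQuantile(DoubleUnaryOperator cdf, double p, double x, double tol) {
        return Math.abs(cdf.applyAsDouble(x) - p) <= tol;
    }

    public static void main(String[] args) {
        double p = 0.16038;
        double exact = Math.pow(p, 20.0); // closed-form quantile of the toy CDF
        System.out.println("exact quantile passes: "
                + isPlausibleQuantile(QuantileCheck::toyCdf, p, exact, 1e-9));
        System.out.println("x = 1.35e-9 passes:    "
                + isPlausibleQuantile(QuantileCheck::toyCdf, p, 1.35e-9, 1e-9));
    }
}
```

With the FDistribution from the reproducible example, the cdf argument would presumably be dist::cumulativeProbability and x the value returned by inverseCumulativeProbability(p).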

Severity of the issue

This is not a small numerical error.

The returned value is roughly five orders of magnitude larger than the region containing the correct solution:

  • Returned x ≈ 1.35e-9
  • However, values as small as 1e-14 already satisfy CDF(x) > p

This demonstrates that the correct solution lies far below the returned value.

Additionally:

  • CDF(returned x) ≈ 0.3069
  • target p = 0.16038

So the returned value corresponds to a probability almost twice as large as requested.

This indicates that the root-finding algorithm fails to locate the correct region, rather than suffering from minor floating-point inaccuracies.

Analysis

This behavior suggests a failure in the root-finding process used in inverseCumulativeProbability.

Possible causes include:

  • incorrect bracketing of the root
  • premature convergence
  • numerical instability when the numerator degrees of freedom are very small

The issue appears when the CDF changes very rapidly near zero.
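
As an illustration of how premature convergence alone could produce an error of this size, the sketch below runs plain bisection on a stand-in CDF with two stopping rules. Everything in it is an assumption made for the example: the class BisectionToleranceDemo, the toy CDF F(x) = x^0.05 on [0, 1], the bracket [0, 1], and the tolerances 1e-9 and 1e-12. It is not the solver Hipparchus uses and does not claim to identify the actual cause.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical demo, not the Hipparchus root finder.
final class BisectionToleranceDemo {

    // Stand-in CDF that rises very steeply near zero: F(x) = x^0.05 on [0, 1].
    static double toyCdf(double x) {
        if (x <= 0.0) {
            return 0.0;
        }
        if (x >= 1.0) {
            return 1.0;
        }
        return Math.pow(x, 0.05);
    }

    // Bisection that stops once the bracket is narrower than an absolute tolerance.
    static double bisectAbsolute(DoubleUnaryOperator cdf, double p,
                                 double lo, double hi, double absTol) {
        while (hi - lo > absTol) {
            double mid = 0.5 * (lo + hi);
            if (cdf.applyAsDouble(mid) >= p) {
                hi = mid;
            } else {
                lo = mid;
            }
        }
        return hi;
    }

    // Bisection that stops once the bracket is narrow relative to its upper end.
    static double bisectRelative(DoubleUnaryOperator cdf, double p,
                                 double lo, double hi, double relTol) {
        while (hi - lo > relTol * hi) {
            double mid = 0.5 * (lo + hi);
            if (mid <= lo || mid >= hi) {
                break; // no representable midpoint left
            }
            if (cdf.applyAsDouble(mid) >= p) {
                hi = mid;
            } else {
                lo = mid;
            }
        }
        return hi;
    }

    public static void main(String[] args) {
        double p = 0.16038;
        double abs = bisectAbsolute(BisectionToleranceDemo::toyCdf, p, 0.0, 1.0, 1e-9);
        double rel = bisectRelative(BisectionToleranceDemo::toyCdf, p, 0.0, 1.0, 1e-12);
        System.out.println("absolute tolerance: x = " + abs + ", CDF(x) = " + toyCdf(abs));
        System.out.println("relative tolerance: x = " + rel + ", CDF(x) = " + toyCdf(rel));
    }
}
```

In this toy setting the root is far smaller than the absolute tolerance, so the absolute-tolerance loop stops while the bracket still spans many orders of magnitude, whereas the relative-tolerance loop keeps narrowing until CDF(x) is close to p. Whether the same mechanism is responsible for the reported behavior would need to be confirmed against the library's solver settings.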


Additional Notes

This issue was originally observed in Apache Commons Math 3.6.1 and appears to persist in Hipparchus.
