Lazy allocation for the coefficients in a readonly struct with a static constructor. Allocation and computation are done only once, the first time we invoke Erfinv. On the other hand, the 8 KB array will never be collected.
Codecov Report❌ Patch coverage is
Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #7569      +/-   ##
==========================================
- Coverage   69.02%   69.02%   -0.01%
==========================================
  Files        1482     1482
  Lines      274096   274099       +3
  Branches    28266    28266
==========================================
- Hits       189191   189187       -4
- Misses      77518    77526       +8
+ Partials     7387     7386       -1
```
Flags with carried forward coverage won't be shown.
There were no tests before. Should I add some?
rokonec left a comment
Thank you for identifying this issue and submitting a fix! The observation that coefficients are recomputed on every call is valid. However, I'm requesting changes for two reasons:
1. Missing use case justification
Erfinv() currently has zero callers in the codebase. Before optimizing it, we need a concrete use case demonstrating that frequent Erfinv calls are needed and that the current per-call cost is actually a bottleneck. Could you provide the scenario where you're hitting this?
2. If optimization is warranted, prefer the Probit-based approach
The existing Probit() function in the same class (ProbabilityFunctions.cs) already implements a high-quality rational polynomial approximation (Beasley-Springer-Moro). Since erfinv(x) = Probit((1+x)/2) / sqrt(2), the entire Taylor series can be eliminated:
```csharp
public static double Erfinv(double x)
{
    if (x > 1 || x < -1)
        return Double.NaN;
    if (x == 1)
        return Double.PositiveInfinity;
    if (x == -1.0)
        return Double.NegativeInfinity;

    return Probit((1.0 + x) / 2.0) / Math.Sqrt(2.0);
}
```

This approach is superior to caching the Taylor series coefficients because:
Performance (benchmarked with 100,000 values x 100 iterations)
| Approach | Time | Speedup vs Original |
|---|---|---|
| Original (per-call alloc) | ~128,512 ms (1 pass) | 1x |
| Cached coefficients (this PR) | 9,973 ms | ~1,290x |
| Probit-based | 96 ms | ~134,000x |
The Probit approach is ~104x faster than caching because it evaluates a degree-7 rational polynomial (~30 FP ops) instead of summing 1,000 series terms (~5,000 FP ops + 8 KB array traversal).
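The identity behind the rewrite can be sanity-checked without the C# codebase. Below is a hypothetical, self-contained Python sketch: it uses the standard library's normal quantile (`statistics.NormalDist().inv_cdf`) in place of `Probit`, so the numbers illustrate the identity, not the exact behavior of `ProbabilityFunctions.Probit`:

```python
# Sketch: erfinv via the normal quantile ("probit"), using
#   erfinv(x) = probit((1 + x) / 2) / sqrt(2)
# NormalDist().inv_cdf plays the role of Probit here (an assumption for
# illustration; the real PR would call the existing C# Probit).
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, stdev 1

def erfinv(x: float) -> float:
    if x > 1.0 or x < -1.0:
        return math.nan
    if x == 1.0:
        return math.inf
    if x == -1.0:
        return -math.inf
    return _STD_NORMAL.inv_cdf((1.0 + x) / 2.0) / math.sqrt(2.0)

# Round-trip check: erf(erfinv(x)) should recover x.
for x in (-0.9, -0.5, 0.0, 0.3, 0.75, 0.99):
    assert abs(math.erf(erfinv(x)) - x) < 1e-9
```

Because `inv_cdf` is an existing, heavily exercised quantile implementation, the sketch also mirrors the review's simplicity argument: the whole function is the boundary checks plus one delegation.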
Accuracy - the Taylor series diverges near +/-1
At 100K test values spanning (-1, 1), the 1000-term Taylor series produces incorrect results near the boundaries:
| Metric | Taylor series (Original & Cached) | Probit-based |
|---|---|---|
| Max round-trip error abs(Erf(Erfinv(x)) - x) | 2.57e-4 | 1.39e-7 |
| Max divergence from Probit at x ~ 0.99998 | 0.44 (series fails to converge) | -- |
The series approach gives ~1,850x worse accuracy at the tails.
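The tail behavior is easy to reproduce. The following hypothetical Python sketch builds the standard Maclaurin series for erfinv (the term count of 1000 matches the review's setup; the function names are illustrative, not taken from the C# source) and compares its round-trip error against a probit-based version built on `statistics.NormalDist().inv_cdf`:

```python
# Sketch: 1000-term Maclaurin series for erfinv vs a probit-based erfinv.
# The series uses the standard coefficient recurrence
#   c_0 = 1,  c_k = sum_{m=0}^{k-1} c_m * c_{k-1-m} / ((m + 1) * (2m + 1))
# with erfinv(z) = sum_k c_k / (2k + 1) * (sqrt(pi)/2 * z)^(2k + 1).
import math
from statistics import NormalDist

TERMS = 1000

def series_coefficients(n: int) -> list:
    c = [1.0]
    for k in range(1, n):
        c.append(sum(c[m] * c[k - 1 - m] / ((m + 1) * (2 * m + 1))
                     for m in range(k)))
    return c

_C = series_coefficients(TERMS)

def erfinv_series(z: float) -> float:
    w = math.sqrt(math.pi) / 2.0 * z
    return sum(_C[k] / (2 * k + 1) * w ** (2 * k + 1) for k in range(TERMS))

def erfinv_probit(z: float) -> float:
    return NormalDist().inv_cdf((1.0 + z) / 2.0) / math.sqrt(2.0)

def round_trip_error(f, z: float) -> float:
    return abs(math.erf(f(z)) - z)

# Near zero both approaches agree to high precision; at z = 0.99998 the
# truncated series falls visibly short while the probit route stays accurate.
assert round_trip_error(erfinv_series, 0.5) < 1e-9
assert round_trip_error(erfinv_probit, 0.99998) < 1e-8
assert round_trip_error(erfinv_series, 0.99998) > 1e-6
```

The gap arises because erfinv has a singularity at z = 1, so the series' convergence slows to a crawl near the boundary: any fixed truncation, no matter how many coefficients are cached, undershoots there.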
Simplicity
- No new types, no readonly struct, no #pragma suppressions, no 8 KB static array
- 5 lines of code delegating to an existing, well-tested function
- Zero additional memory
Summary
If a use case for frequent Erfinv calls is provided, the right fix is the one-liner delegating to Probit -- it's faster, more accurate, and simpler. The caching approach preserves the flawed Taylor series and adds structural complexity for a 104x slower result.
Happy to help with the implementation if you'd like to go this route!
Fixes: #7568
Before:
an array of 1000 doubles is allocated and computed on every Erfinv call
Now:
lazy allocation for the coefficients in a readonly struct with a static constructor;
allocation and computation are done only once, the first time we invoke Erfinv
We are excited to review your PR.
So we can do the best job, please check:
- "Fixes #nnnn" is in your description, to cause GitHub to automatically close the issue(s) when your PR is merged.