100% Test coverage all (?) distributions#1399
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #1399 +/- ##
===========================================
+ Coverage 95.35% 95.61% +0.25%
===========================================
Files 825 825
Lines 68310 68487 +177
===========================================
+ Hits 65139 65481 +342
+ Misses 3171 3006 -165
... and 7 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
|
@jzmaddock (or @mborland) you had previously mentioned expanding the existing test coverage. Would this PR be helpful for that? I was just going to keep plugging along for every distribution. Most of the missing tests are for edge cases, or returning NaNs. I've also found a couple of boolean operations that weren't correct which I fixed. There have been a couple cases where something like this appeared |
| BOOST_MATH_CUDA_ENABLED inline bool check_dist_and_prob(const char* function, const RealType& N, RealType p, RealType prob, RealType* result, const Policy& pol) | ||
| { | ||
| if((check_dist(function, N, p, result, pol) && detail::check_probability(function, prob, result, pol)) == false) | ||
| if(!(check_dist(function, N, p, result, pol) && detail::check_probability(function, prob, result, pol))) |
There was a problem hiding this comment.
I don't mind this change, but the two lines are equivalent I think, so if we end up with too many of these it gets harder to review the PR.
That said, I wonder if we should just simplify the whole thing down to:
return check_dist(function, N, p, result, pol) && detail::check_probability(function, prob, result, pol);
There was a problem hiding this comment.
BTW I get the point that the others have a missing set of ()'s. I wonder if those would simplify down to one liners too and be easier to read as a result?
There was a problem hiding this comment.
I don't mind this change, but the two lines are equivalent I think, so if we end up with too many of these it gets harder to review the PR.
I was worried about that. For now, I'll leave these alone and just fix any missing ()'s.
There was a problem hiding this comment.
Actually, there's quite a few of these issues. For example, in exponential.hpp all the checks are written as 0 == detail::verify_lambda(function, lambda, &result, Policy()).
return check_dist(function, N, p, result, pol) && detail::check_probability(function, prob, result, pol);
Would this work? Wouldn't the function then return true/false when we want to return the variable result which is defined as
result = policies::raise_domain_error<RealType>(
function,
"The scale parameter \"lambda\" must be > 0, but was: %1%.", l, pol);```
There was a problem hiding this comment.
Actually, there's quite a few of these issues. For example, in exponential.hpp all the checks are written as 0 == detail::verify_lambda(function, lambda, &result, Policy()).
I don't see an issue there? Arguably it should be false == detail::verify_lambda(function, lambda, &result, Policy()) but even that's hardly a necessary change?
Would this work? Wouldn't the function then return true/false when we want to return the variable result which is defined as
I'm looking specifically at the helpers that compound two or more checks (and return a bool), for example:
template <class RealType, class Policy>
BOOST_MATH_CUDA_ENABLED inline bool check_dist_and_prob(const char* function, const RealType& N, RealType p, RealType prob, RealType* result, const Policy& pol)
{
if((check_dist(function, N, p, result, pol) && detail::check_probability(function, prob, result, pol)) == false)
return false;
return true;
}
Could be simplified to:
template <class RealType, class Policy>
BOOST_MATH_CUDA_ENABLED inline bool check_dist_and_prob(const char* function, const RealType& N, RealType p, RealType prob, RealType* result, const Policy& pol)
{
return (check_dist(function, N, p, result, pol) && detail::check_probability(function, prob, result, pol);
}
?
There was a problem hiding this comment.
Actually, there's quite a few of these issues. For example, in exponential.hpp all the checks are written as 0 == detail::verify_lambda(function, lambda, &result, Policy()).
I don't see an issue there? Arguably it should be
false == detail::verify_lambda(function, lambda, &result, Policy())but even that's hardly a necessary change?
Yes, you're right. Issue wasn't the right word. It's really just a style change. I'll leave these since it doesn't change anything.
Could be simplified to:
template <class RealType, class Policy> BOOST_MATH_CUDA_ENABLED inline bool check_dist_and_prob(const char* function, const RealType& N, RealType p, RealType prob, RealType* result, const Policy& pol) { return (check_dist(function, N, p, result, pol) && detail::check_probability(function, prob, result, pol); }
Yeah, that would consolidate a couple lines and maybe be more readable. I'll won't make these changes since the PR is already quite large (unless you say otherwise).
| BOOST_CHECK((boost::math::isnan)(logpdf(ignore_error_arcsine(-1, 1), static_cast <RealType>(+2)))); // x > x_max | ||
|
|
||
| // CDF | ||
| BOOST_CHECK((boost::math::isnan)(cdf(ignore_error_arcsine(0, 1), std::numeric_limits<RealType>::infinity()))); // x == infinity |
There was a problem hiding this comment.
My only comment here, is that generally I would prefer to see a BOOST_CHECK_THROW and verify that the correct exception is thrown: as long as we're getting the right error handler called (and hence correct exception thrown) then we know it's guaranteed we'll get a NaN when exceptions are off, where as a NaN when exceptions are off does not guarantee the correct exception type when they're on. Hope that makes sense!
It wouldn't hurt to do both of course, but probably the only time we actually NEED both is to check for the correct sign on infinities when an overflow_error is raised.
One final really picky point: we should really hide use of std::numeric_limits<RealType>::infinity() behind a BOOST_MATH_IF_CONSTEXPR(std::numeric_limits<RealType>::has_infinity), likewise for NaN's.
There was a problem hiding this comment.
It wouldn't hurt to do both of course, but probably the only time we actually NEED both is to check for the correct sign on infinities when an overflow_error is raised.
The reason I did both is because we never actually hit the return result line in the code coverage report. Maybe this isn't the best reason to do add it though.
One final really picky point: we should really hide use of
std::numeric_limits<RealType>::infinity()behind aBOOST_MATH_IF_CONSTEXPR(std::numeric_limits<RealType>::has_infinity), likewise for NaN's.
Why does the BOOST_MATH_IF_CONSTEXPR need added?
There was a problem hiding this comment.
Why does the BOOST_MATH_IF_CONSTEXPR need added?
Strictly speaking it's not required and we could use an if, but MSVC emits warnings if something could be if constexpr and isn't, so it just avoids that annoyance. if constexpr is C++17 BTW and we're still C++14 compatible.
|
Really big thanks for taking this on! Yes definitely worth doing, and yes, it's a really tedious old job that mostly throws up a few trivial errors. Gets a big thumbs up from me! |
|
@jzmaddock I had a question regarding the template <class RealType, class Policy>
BOOST_MATH_GPU_ENABLED inline RealType logcdf(const complemented2_type<exponential_distribution<RealType, Policy>, RealType>& c)
{
BOOST_MATH_STD_USING // for ADL of std functions
constexpr auto function = "boost::math::logcdf(const exponential_distribution<%1%>&, %1%)";
RealType result = 0;
RealType lambda = c.dist.lambda();
if(0 == detail::verify_lambda(function, lambda, &result, Policy()))
return result;
if(0 == detail::verify_exp_x(function, c.param, &result, Policy()))
return result;
// Workaround for VC11/12 bug:
if (c.param >= tools::max_value<RealType>())
return 0; // Shouldn't this be -inf?
result = -c.param * lambda;
return result;
} |
|
Most of the test cases that I've written so far have seemed pretty redundant. I'm mainly just checking functions return a NaN if variables go outside their bounds. I'm wondering if there's maybe a better way to do this. Could we define a new class of floating point number which inherits from I just feel like there has to be a cleaner way to do this than copy and pasting tests which probe each variable for each distribution. |
If you follow the logic, if |
Unless I'm misunderstanding, I don't see a way to make this work: we want to be able to instantiate these on float or double, so we need to test that those types do the right thing? But maybe I'm missing something too... |
Hmmm, so is there any need for the check below if this will be caught in // Workaround for VC11/12 bug:
if (c.param >= tools::max_value<RealType>())
return 0; |
I don't think you're missing anything. I'm just doing a lot of copy and pasting of tests which makes me think there's a more systematic way to do this testing. |
|
Another edge case that I found a little weird. The template <class Dist, class Policy>
inline
typename Dist::value_type find_location( // For example, normal mean.
typename Dist::value_type z, // location of random variable z to give probability, P(X > z) == p.
// For example, a nominal minimum acceptable z, so that p * 100 % are > z
typename Dist::value_type p, // probability value desired at x, say 0.95 for 95% > z.
typename Dist::value_type scale, // scale parameter, for example, normal standard deviation.
const Policy& pol
)
{
static_assert(::boost::math::tools::is_distribution<Dist>::value, "The provided distribution does not meet the conceptual requirements of a distribution.");
static_assert(::boost::math::tools::is_scaled_distribution<Dist>::value, "The provided distribution does not meet the conceptual requirements of a scaled distribution.");
static const char* function = "boost::math::find_location<Dist, Policy>&, %1%)";
if(!(boost::math::isfinite)(p) || (p < 0) || (p > 1))
{
return policies::raise_domain_error<typename Dist::value_type>(
function, "Probability parameter was %1%, but must be >= 0 and <= 1!", p, pol);
}
if(!(boost::math::isfinite)(z))
{
return policies::raise_domain_error<typename Dist::value_type>(
function, "z parameter was %1%, but must be finite!", z, pol);
}
if(!(boost::math::isfinite)(scale))
{
return policies::raise_domain_error<typename Dist::value_type>(
function, "scale parameter was %1%, but must be finite!", scale, pol);
}
//cout << "z " << z << ", p " << p << ", quantile(Dist(), p) "
// << quantile(Dist(), p) << ", quan * scale " << quantile(Dist(), p) * scale << endl;
return z - (quantile(Dist(), p) * scale);
} // find_locationIf this is unintended, I'll just change the code above to use |
| return result; | ||
| if(0 == detail::check_finite(function, a, &result, Policy())) | ||
| return result; | ||
| if(0 == detail::check_finite(function, a, &result, Policy())) |
There was a problem hiding this comment.
FYI, I deleted this, and another instance in this file, because they were duplicate checks.
Missed that, given the comment about VC11/12 and the fact that these are obsolete, I would say not any more. |
I wonder if we can't just write a generic set of tests, for example to test invalid support values we could use: And then do something similar for invalid probability values, and finally I guess something similar for distributions which were constructed from invalid parameters (this last one requires no exceptions to be thrown). |
I think unintended and too lax yes. |
I think something like this would be very nice! I think that invalid probability values should be quite easy since we just need to check negative inputs and inputs greater than 1. The invalid distribution construction might be harder. Maybe we should just merge the PR at this point. Then I could open a separate PR to implement the |
|
Yep, OK I've merged. With a bit of improved testing infrastructure, hopefully most of the other distribution tests will just need one or two lines added to call the boiler plate testers. |
Added tests for the following distributions to get 100% test coverage.
find_alphaandfind_beta, there isn't an upper bound on the mean and variance. However, there should be.find_location, I think there might be redundant functions. Just a note to go back and take a look later.Here is the code coverage report for my information.