The goal is to say: "These values lie within a band of 95 % of values around the mean in a normal distribution."
Now I am trying to convert a percentage to a z-score so that I can get the precise range of values. Something like <lower bound, upper bound>
would be enough.
So I need something like
double z_score(double percentage) {
// ...
}
// ...
// according to https://en.wikipedia.org/wiki/68–95–99.7_rule
z_score(68.27) == 1
z_score(95.45) == 2
z_score(99.73) == 3
I found an article explaining how to do it with a function from the Boost library, but
double z_score( double percentage ) {
return - sqrt( 2 ) / boost::math::erfc_inv( 2 * percentage / 100 );
}
does not work properly and returns weird values:
z_score(95) == 1.21591 // instead of 1.96
Also the boost library is kinda heavy and I plan to use it in a Ruby gem, so it should be as lightweight as possible.
Does anyone have an idea?
I'd say you were "close enough".
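#include <boost/math/special_functions/erf.hpp>
#include <cmath>
#include <initializer_list>
#include <iostream>

// Number of standard deviations n such that `percentage` percent of a
// normal distribution lies within n standard deviations of the mean.
double z_score(double percentage) {
    return std::sqrt(2.0) * boost::math::erf_inv(percentage / 100);
}

int main() {
    // Print each input percentage next to its z-score.
    for (double x : {68.27, 95.45, 99.73}) {
        std::cout << x << " " << z_score(x) << "\n";
    }
}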
outputs:
68.27 1.00002
95.45 2
99.73 2.99998
I do not know where you got the minus sign in front, why it's erfc_inv (with the c) rather than erf_inv, or why sqrt(2) is divided by it. From the Wikipedia section Normal_distribution#Standard_deviation_and_coverage I read that:
p <- this is probability, ie. your input
u <- mean value
o <- std dev
n <- the count of std deviations from mean, ie. 1, 2, 3 etc.
p = F(u + no) - F(u - no) = fi(n) - fi(-n) = erf(n / sqrt(2))
p = erf(n / sqrt(2))
erf_inv(p) = n / sqrt(2)
erf_inv(p) * sqrt(2) = n
n = sqrt(2) * erf_inv(p)
"Also the boost library is kinda heavy"
A five-minute search turned up this and this C implementation of erf_inv.