I have been working on a PHP page that calculates the probabilities of different outcomes when a sample group is randomly selected from a larger group consisting of two types of people (+ and -).
For example, it can calculate the probability of having 0 (or n) smokers in a group of 1000 people randomly chosen from across the United States, given that 0.15 of Americans are smokers (+).
It works very well for populations below 10000 people, but for bigger populations such as 1000000, it echoes 0 for all probabilities unless the precision (the number of digits after the decimal point) is increased to something like 3000. Even then, it takes forever.
The code works by calculating the probability of 0 positives, then deriving the probability of 1 positive from it, and so on. Most of these probabilities are never needed, though.
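Written out, the scheme described above looks roughly as follows. The symbols are introduced here only for illustration and do not appear in the code: N is the target population, K = N·f is the number of positives (f being the fraction of positives), n is the selected population, and P(i) is the probability of exactly i positives in the sample. This is the hypergeometric distribution.

```latex
P(0) = \prod_{j=0}^{K-1} \frac{N-n-j}{N-j},
\qquad
P(i) = P(i-1)\,\frac{(n-i+1)(K-i+1)}{i\,(N-K-n+i)}, \quad i = 1, \dots, n .
```

P(0) is a product of K factors that are each below 1, which is why it becomes vanishingly small for large populations.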
I have been thinking that if I could find a fast way of calculating a nearly exact (99.999% or higher) value of very big factorials (like 1000000!), there would be no need to start from 0. The calculation could start right where it is needed, and with very low precision to reduce the time it takes.
Here is the code:
<html>
<body>
<form method="get">
target population:
<input type="text" name="tpop"><br><br>
selected population:
<input type="text" name="selected"><br><br>
fraction of positives:
<input type="text" name="fop"><br><br>
output margin:
<input type="text" name="margin"><br><br>
precision:
<input type="text" name="precision"><br><br>
<input type="submit">
</form>
</body>
</html>
<?php
set_time_limit(0);
if (isset($_GET["precision"],$_GET["tpop"],$_GET["selected"],$_GET["fop"],$_GET["margin"])&&$_GET["precision"]!=''&&$_GET["tpop"]!=''&&$_GET["selected"]!=''&&$_GET["fop"]!=''&&$_GET["margin"]!=''&&$_GET["tpop"]>=$_GET["selected"]&&$_GET["fop"]>=0&&$_GET["fop"]<=1){
$tpop=$_GET["tpop"];
$selected=$_GET["selected"];
$fop=$_GET["fop"];
$margin=$_GET["margin"];
$ioioio=0;
$precision=$_GET["precision"];
$minor=($selected*$fop)-$margin;
$maxor=$minor+(2*$margin);
$popopo=bcmul($tpop,$fop);
echo '<br><br>min is'.$minor.'max is'.$maxor.'<br><br>';
$mmm=bcsub($tpop,$selected,$precision);
$rea=bcsub($mmm,1,$precision);
$fops=bcmul($tpop,$fop);
$trss=bcsub($tpop,$fops,$precision);
$trss=bcsub($trss,$selected,$precision);
$trss=bcadd($trss,1,$precision);
while($rea>=$trss){
$mmm=bcmul($mmm,$rea,$precision);
$rea=bcsub($rea,1,$precision);
}
$nnn=$tpop;
$sfg=bcsub($nnn,1,$precision);
$ugt=bcmul($tpop,$fop,$precision);
$uyt=bcsub($tpop,$ugt);
$uyt=bcadd($uyt,1,$precision);
while($sfg>=$uyt){
$nnn=bcmul($nnn,$sfg,$precision);
$sfg=bcsub($sfg,1,$precision);
}
$zero=bcdiv($mmm,$nnn,$precision);
echo '0==>'.$zero.'<br><br>';
$a=$selected;
$b=($tpop-($tpop*$fop)-$selected+1);
$c=1;
$d=($tpop*$fop);
$i=1;
$origzero=$zero;
$save=$zero;
while($i<=$selected){
if($d<=0){
echo $i.'==>impossible<br><br>';
$a--;
$b++;
$c++;
$d--;
$i++;
}else{
$zero=bcmul($zero,$a,$precision);
$zero=bcmul($zero,$d,$precision);
$zero=bcdiv($zero,$b,$precision);
$zero=bcdiv($zero,$c,$precision);
if($i>=$minor){
if($i<=$maxor){
if($i<=$popopo){
$ioioio=bcadd($ioioio,$zero,$precision);
echo 'following value is included in p value<br>';
echo $i.'==>'.$zero.'<br><br>';
}
}
}
$save=bcadd($save,$zero,$precision);
$a--;
$b++;
$c++;
$d--;
$i++;
}
}
echo 'precision==> '.$save.'<br><br>';
$savee = bcsub(1,$save,$precision);
echo '1-precision==> '.$savee.'<br><br>';
if($minor<0||$maxor>$selected){
echo 'p value==>margin larger than surrounding probabilities; select a smaller margin to calculate p value';
} elseif($minor>0){
echo 'p value==>'.$ioioio;
} else{
$ioioiop=bcadd($ioioio,$origzero,$precision);
echo 'p value(0included)==> '.$ioioiop;}
}
?>
Dear @shukshin.ivan, many thanks for your response, that was exactly what I was looking for : ] Here is an example of how it works, for anyone else who might have the same question:
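// approximate 950000! and print it in scientific notation
$x=950000;
$x =2*$x+1;
$P=pi();
// natural logarithm of the factorial
$x =(log(2.0*$P)+log($x/2.0)*$x-$x-(1.0-7.0/(30.0*$x*$x))/(6.0*$x))/2.0;
// convert to a base-10 logarithm
$x=$x/log(10);
// the integer part is the decimal exponent
$ex=floor($x);
// the fractional part gives the mantissa
$x=pow(10,$x-$ex);
// keep the first 6 characters of the mantissa
$res=substr($x,0,6).'E'.$ex;
echo $res;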
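The snippet above can be wrapped in a reusable function and checked against a factorial that is still small enough to know exactly. This is only a sketch: the function name `approxFactorialParts` and its return shape are made up for illustration and are not part of the original code.

```php
<?php
// Hypothetical wrapper (name and return shape are assumptions for illustration).
// Returns [mantissa, exponent] such that n! is roughly mantissa * 10^exponent.
function approxFactorialParts(int $n): array
{
    $x = 2 * $n + 1;
    // Same approximation of ln(n!) as in the snippet above.
    $ln = (log(2.0 * M_PI) + log($x / 2.0) * $x - $x
        - (1.0 - 7.0 / (30.0 * $x * $x)) / (6.0 * $x)) / 2.0;
    $log10 = $ln / log(10);               // base-10 logarithm of n!
    $exponent = (int) floor($log10);      // decimal exponent
    $mantissa = pow(10, $log10 - $exponent);
    return [$mantissa, $exponent];
}

// Sanity check: 20! = 2432902008176640000 exactly.
[$m, $e] = approxFactorialParts(20);
printf("20! ~ %.4fE%d\n", $m, $e);

// The input used in the example above.
[$m, $e] = approxFactorialParts(950000);
printf("950000! ~ %.4fE%d\n", $m, $e);
```

The first line of output should agree with the leading digits of the exact value of 20!, which gives some confidence in the result for inputs that are too large to check directly.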
You can use Stirling's approximation. It is quite precise for large numbers. The idea is that the factorial can be approximated as n! ≈ sqrt(2·π·n)·(n/e)^n.
A set of other algorithms can be found here.
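As a rough sketch of how this could be applied to the question, the probability of exactly k positives can be computed directly from logarithms of factorials, without starting from 0 and without arbitrary-precision arithmetic. All function names below are hypothetical, and the code assumes the number of positives is passed in as an integer rather than derived from the fraction. It uses the Stirling series with two correction terms for large n and exact sums for small n.

```php
<?php
// Sketch only: function names are hypothetical, not from the original code.

// ln(n!) via the Stirling series; small n are summed exactly.
function lnFactorial(int $n): float
{
    if ($n < 20) {
        $sum = 0.0;
        for ($i = 2; $i <= $n; $i++) {
            $sum += log($i);
        }
        return $sum;
    }
    return $n * log($n) - $n + 0.5 * log(2.0 * M_PI * $n)
        + 1.0 / (12.0 * $n) - 1.0 / (360.0 * $n * $n * $n);
}

// ln of the binomial coefficient C(n, k).
function lnChoose(int $n, int $k): float
{
    return lnFactorial($n) - lnFactorial($k) - lnFactorial($n - $k);
}

// Probability of exactly $k positives when $selected people are drawn
// without replacement from $tpop people, $positives of whom are positive.
function probExactly(int $tpop, int $positives, int $selected, int $k): float
{
    if ($k < 0 || $k > $positives || $k > $selected
        || $selected - $k > $tpop - $positives) {
        return 0.0; // impossible outcome
    }
    $ln = lnChoose($positives, $k)
        + lnChoose($tpop - $positives, $selected - $k)
        - lnChoose($tpop, $selected);
    return exp($ln);
}

// Numbers in the spirit of the question: 1000 people drawn from 1000000,
// of whom 150000 (a fraction of 0.15) are positive.
echo probExactly(1000000, 150000, 1000, 150), "\n";
```

Each call costs a handful of logarithms regardless of the population size. For outcomes so unlikely that `exp()` underflows to 0, the logarithm `$ln` itself can be reported instead.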