Highly specific factorial calculations in PHP


I have been working on a PHP page that calculates the probabilities of different outcomes when randomly selecting a sample group from a larger group consisting of two types of people (+ and -).

For example, it can calculate the probability of having 0 (or n) smokers in a group of 1000 people randomly chosen from across the United States, assuming that a fraction of 0.15 of Americans are smokers (+).

It works very well for populations below 10,000 people, but for bigger populations, such as 1,000,000, it echoes 0 for all probabilities unless the precision (the number of digits after the decimal point) is increased to around 3000. Even then, it takes forever.

The code works by calculating the probability of 0 positives, then deriving the probability of 1 positive from it, and so on. Most of these probabilities are never used.
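The step-by-step scheme described above can be sketched with ordinary floats as follows. This is only an illustration, not the BCMath code from the question: the function name `hypergeomByRecurrence` and the parameters `$N` (target population), `$K` (number of positives) and `$n` (sample size) are hypothetical, and the sketch assumes `$n <= $N - $K` so that the probability of 0 positives is non-zero.

```php
<?php
// Hypothetical helper: P(X = k) for k = 0..$n, where X is the number of
// positives in a sample of $n drawn without replacement from $N people,
// $K of whom are positive. Assumes 0 <= $n <= $N - $K so that P(0) > 0.
function hypergeomByRecurrence(int $N, int $K, int $n): array
{
    // P(0) = product over j = 0..n-1 of (N-K-j)/(N-j)
    $p = 1.0;
    for ($j = 0; $j < $n; $j++) {
        $p *= ($N - $K - $j) / ($N - $j);
    }
    $probs = [$p];

    // P(k) = P(k-1) * (n-k+1)(K-k+1) / (k (N-K-n+k))
    for ($k = 1; $k <= $n; $k++) {
        $p *= ($n - $k + 1) * max($K - $k + 1, 0) / ($k * ($N - $K - $n + $k));
        $probs[] = $p;
    }
    return $probs;
}

$probs = hypergeomByRecurrence(10, 3, 4);
printf("%.4f %.4f\n", $probs[0], $probs[1]); // prints 0.1667 0.5000
```

With plain doubles the starting value P(0) can become too small to represent when the sample is large, which is presumably why the code in the question falls back on BCMath with a large scale.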

I have been thinking that if I could find a fast way to calculate a nearly exact value (99.999% accurate or better) of very big factorials (like 1000000!), there would be no need to start from 0; the calculation could start at the point where it is needed, with very low precision to reduce the time it takes.

Here is the code:

<html>
<body>
    <form method="get">
        target population:
        <input type="text" name="tpop"><br><br>
        selected population:
        <input type="text" name="selected"><br><br>
        fraction of positives:
        <input type="text" name="fop"><br><br>
        output margin:
        <input type="text" name="margin"><br><br>
        precision:
        <input type="text" name="precision"><br><br>
        <input type="submit">
    </form>
</body>
</html>
<?php
set_time_limit(0);
if (isset($_GET["precision"],$_GET["tpop"],$_GET["selected"],$_GET["fop"],$_GET["margin"])&&$_GET["precision"]!=''&&$_GET["tpop"]!=''&&$_GET["selected"]!=''&&$_GET["fop"]!=''&&$_GET["margin"]!=''&&$_GET["tpop"]>=$_GET["selected"]&&$_GET["fop"]>=0&&$_GET["fop"]<=1){
    $tpop=$_GET["tpop"];
    $selected=$_GET["selected"];
    $fop=$_GET["fop"];
    $margin=$_GET["margin"];
    $ioioio=0;
    $precision=$_GET["precision"];

    $minor=($selected*$fop)-$margin;
    $maxor=$minor+(2*$margin);
    $popopo=bcmul($tpop,$fop);
    echo '<br><br>min is'.$minor.'max is'.$maxor.'<br><br>';

    $mmm=bcsub($tpop,$selected,$precision);
    $rea=bcsub($mmm,1,$precision);
    $fops=bcmul($tpop,$fop);
    $trss=bcsub($tpop,$fops,$precision);
    $trss=bcsub($trss,$selected,$precision);
    $trss=bcadd($trss,1,$precision);
    // Numerator of P(0): (tpop-selected) * (tpop-selected-1) * ... * (tpop-positives-selected+1)
    while($rea>=$trss){
        $mmm=bcmul($mmm,$rea,$precision);
        $rea=bcsub($rea,1,$precision);
    }

    // Denominator of P(0): tpop * (tpop-1) * ... * (tpop-positives+1)
    $nnn=$tpop;
    $sfg=bcsub($nnn,1,$precision);
    $ugt=bcmul($tpop,$fop,$precision);
    $uyt=bcsub($tpop,$ugt);
    $uyt=bcadd($uyt,1,$precision);
    while($sfg>=$uyt){
        $nnn=bcmul($nnn,$sfg,$precision);
        $sfg=bcsub($sfg,1,$precision);
    }

    // Probability of 0 positives in the sample
    $zero=bcdiv($mmm,$nnn,$precision);

    echo '0==>'.$zero.'<br><br>';
    // Factors of the ratio P(i)/P(i-1) = (a*d)/(b*c)
    $a=$selected;
    $b=($tpop-($tpop*$fop)-$selected+1);
    $c=1;
    $d=($tpop*$fop);
    $i=1;

    $origzero=$zero;
    $save=$zero;

    while($i<=$selected){

        if($d<=0){
            echo $i.'==>impossible<br><br>';
            $a--;
            $b++;
            $c++;
            $d--;
            $i++;
        }else{
            // Turn P(i-1) into P(i)
            $zero=bcmul($zero,$a,$precision);
            $zero=bcmul($zero,$d,$precision);
            $zero=bcdiv($zero,$b,$precision);
            $zero=bcdiv($zero,$c,$precision);
            if($i>=$minor){
                if($i<=$maxor){
                    if($i<=$popopo){
                        $ioioio=bcadd($ioioio,$zero,$precision);
                        echo 'following value is included in p value<br>';
                        echo $i.'==>'.$zero.'<br><br>';
                    }
                }
            }
            // Running total of all probabilities computed so far
            $save=bcadd($save,$zero,$precision);

            $a--;
            $b++;
            $c++;
            $d--;
            $i++;
        }
    }
    echo 'precision==> '.$save.'<br><br>';
    $savee = bcsub(1,$save,$precision);
    echo '1-precision==> '.$savee.'<br><br>';
    if($minor<0||$maxor>$selected){
        echo 'p value==>margin is larger than the surrounding probabilities; select a smaller margin to calculate the p value';
    } elseif($minor>0){
        echo 'p value==>'.$ioioio;
    } else{
        $ioioiop=bcadd($ioioio,$origzero,$precision);
        echo 'p value(0included)==> '.$ioioiop;
    }
}
?>

Dear @shukshin.ivan, many thanks for your response; that was exactly what I was looking for :] This is an example of how it works, for anyone else who might have the same question:

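$x=950000;                  // n, the number whose factorial is wanted
$x =2*$x+1;                 // the approximation is written in terms of 2n+1
$P=pi();
// natural log of n!
$x =(log(2.0*$P)+log($x/2.0)*$x-$x-(1.0-7.0/(30.0*$x*$x))/(6.0*$x))/2.0;
$x=$x/log(10);              // convert to log base 10
$ex=floor($x);              // decimal exponent
$x=pow(10,$x-$ex);          // mantissa, between 1 and 10
$res=substr($x,0,6).'E'.$ex;
echo $res;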
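To connect this back to the original goal of starting the calculation "from the place where it is needed", here is a rough sketch that reuses the same approximation to evaluate a single probability directly in log space. The function names `logFactorial`, `logChoose` and `hypergeomProb` are hypothetical, the exact summation for small n is my own addition, and the sketch assumes the number of positives is an integer.

```php
<?php
// ln(n!) using the same approximation as the snippet above (x = 2n + 1).
// For small n the logs are simply summed, which is exact up to rounding.
function logFactorial(int $n): float
{
    if ($n < 20) {
        $sum = 0.0;
        for ($i = 2; $i <= $n; $i++) {
            $sum += log($i);
        }
        return $sum;
    }
    $x = 2 * $n + 1;
    return (log(2.0 * M_PI) + log($x / 2.0) * $x - $x
        - (1.0 - 7.0 / (30.0 * $x * $x)) / (6.0 * $x)) / 2.0;
}

// ln of the binomial coefficient C(n, k); -INF when the coefficient is 0.
function logChoose(int $n, int $k): float
{
    if ($k < 0 || $k > $n) {
        return -INF;
    }
    return logFactorial($n) - logFactorial($k) - logFactorial($n - $k);
}

// P(exactly $k positives) in a sample of $n drawn from $N people,
// $K of whom are positive (hypergeometric distribution).
function hypergeomProb(int $N, int $K, int $n, int $k): float
{
    $logP = logChoose($K, $k) + logChoose($N - $K, $n - $k) - logChoose($N, $n);
    return exp($logP);
}

// Probability of exactly 150 positives among 1000 people sampled from
// 1,000,000, of whom 150,000 are positive.
echo hypergeomProb(1000000, 150000, 1000, 150), "\n";
```

Because only logarithms of the factorials are ever held, nothing overflows, and each probability costs a handful of `log` calls instead of a long BCMath loop. The price is that the result carries only double precision.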

Solution

  • You can use Stirling's approximation, which is quite accurate for large numbers. It approximates the factorial as

    $$n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^n$$

    A set of other algorithms can be found here.
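As a minimal sketch of how this could look in PHP, the formula can be evaluated in log space so that the result never overflows a float. The function names below are hypothetical, and the extra `1/(12n)` term is the first correction of Stirling's series, added here as an assumption to tighten the accuracy; it is not part of the basic formula.

```php
<?php
// ln(n!) from Stirling's formula plus the first correction term 1/(12n).
// Assumes $n >= 1.
function stirlingLogFactorial(int $n): float
{
    return $n * log($n) - $n + 0.5 * log(2.0 * M_PI * $n) + 1.0 / (12.0 * $n);
}

// Format n! as "<mantissa>E<exponent>", since the value itself is far too
// large for a float when n is big.
function stirlingFactorialString(int $n, int $digits = 6): string
{
    $log10 = stirlingLogFactorial($n) / log(10);
    $exponent = (int) floor($log10);
    $mantissa = pow(10, $log10 - $exponent);
    return sprintf('%.' . $digits . 'fE%d', $mantissa, $exponent);
}

echo stirlingFactorialString(10, 3), "\n";      // prints 3.629E6 (10! = 3628800)
echo stirlingFactorialString(1000000), "\n";
```

Even at n = 10 the relative error of this sketch is a few parts per million, and it shrinks as n grows.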