php algorithm math statistics linear-regression

How to determine if data is increasing or decreasing in PHP


Let's say we have the following data in an array:

$data1 = [3,5,7,6,8,9,13,14,17,15,16,16,16,18,22,20,21,20];

$data2 = [23,18,17,17,16,15,16,14,15,10,11,7,4,5];

We can say that the data in $data1 is increasing, while the data in $data2 is decreasing.

Using PHP, how can I determine whether the data is increasing or decreasing, and is there a way to measure the rate of increase or decrease, e.g. as a percentage?

Edit

Based on the comments I received, here is what I have tried. What I want to achieve:

  1. I want to know whether the trend of the incoming data is upwards or downwards.
  2. I also want to know the rate at which the data is rising or dropping. For example, $data1 = [1,3,5]; is not the same as $data2 = [1, 20, 55];: the rate of increase of $data1 is much lower than that of $data2.

function increaseOrDecrease($streams = []) : array
{
        $streams = [3,5,7,6,8,9,13,14,17,15,16,16,16,18,22,20,21,20]; // For the increasing

        //$streams = [23,18,17,17,16,15,16,14,15,10,11,7,4,5]; // For the decreasing

        $first = 0;
        $diff = [];

        foreach ($streams as $key => $number) {
            if ($key != 0) {
                $diff[] = $number - $first;
            }
            $first = $number;
        }        

        $avgdifference = array_sum($diff)/count($diff); //Get the average

        $side = $avgdifference > 0 ? 'UP' : 'DOWN';

        $avgsum = array_sum($streams)/count($streams);

        $percentage = abs($avgdifference)/$avgsum * 100;
        
        if ($side == 'UP') {            
            $data = [
                'up' => true,
                'percent' => $percentage,
            ];            
        }else {
            $data = [
                'up' => false,
                'percent' => $percentage,
            ];
        }

        return $data;
}

I would like some help refactoring this code, or suggestions for a better approach to the problem.


Solution

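  • There are several ways to analyze data and extract a trend. The classic method is least squares (simple linear regression), which fits a straight line through the data and computes that line's slope and intercept. The trend is the slope: a positive slope means the data is increasing, a negative slope means it is decreasing, and the magnitude is the average change per step.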

    The formulas are the standard closed-form expressions for simple linear regression.
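
    For reference, with n points (x_i, y_i) and the fitted line y = alpha + beta * x, the closed-form least-squares expressions can be written as follows. This is a sketch of the standard textbook result, with i running from 1 to n; the symbols alpha and beta match the keys returned by the PHP code in this answer.

    ```latex
    \beta = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
                 {n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2},
    \qquad
    \alpha = \frac{1}{n} \sum_{i=1}^{n} y_i - \beta \, \frac{1}{n} \sum_{i=1}^{n} x_i
    ```

    The denominator of beta is zero when there are fewer than two points or when all x values are equal, so the slope is undefined in those cases.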

    A PHP implementation is the following:

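    // Fits y = alpha + beta * x by ordinary least squares.
    // $x and $y are numeric arrays of equal length with 0-based keys.
    function linearRegression($x, $y)
    {
        $x_sum = array_sum($x);
        $y_sum = array_sum($y);
        $xy_sum = 0; // sum of x[i] * y[i]
        $x2_sum = 0; // sum of x[i] squared
        $n = count($x);
        for ($i = 0; $i < $n; $i++)
        {
            $xy_sum += $x[$i] * $y[$i];
            $x2_sum += $x[$i] * $x[$i];
        }
        $denominator = $n * $x2_sum - $x_sum * $x_sum;
        if ($denominator == 0)
        {
            // Fewer than two points, or all x values identical: no unique line.
            throw new InvalidArgumentException('Need at least two distinct x values.');
        }
        $beta = ($n * $xy_sum - $x_sum * $y_sum) / $denominator; // slope
        $alpha = $y_sum / $n - $beta * $x_sum / $n;              // intercept
        return ['alpha' => $alpha, 'beta' => $beta];
    }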
    
    function getTrend($data)
    {
        $x = range(1, count($data)); // [1, 2, 3, ...]
        $fit = linearRegression($x, $data);
        return $fit['beta']; // slope of fitted line
    }
    

    Examples:

    echo getTrend([1, 2, 3]), PHP_EOL; // 1
    echo getTrend([1, 0, -1]), PHP_EOL; // -1
    echo getTrend([3,5,7,6,8,9,13,14,17,15,16,16,16,18,22,20,21,20]), PHP_EOL; // approximately 1.065
    echo getTrend([23,18,17,17,16,15,16,14,15,10,11,7,4,5]), PHP_EOL; // approximately -1.213
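
The question also asks for the rate as a percentage, which the slope alone does not give. One possible way to get it, borrowing the question's own idea of dividing by the average value, is to express the slope relative to the mean of the series. The sketch below is only an illustration: the function name `trendSummary`, the key names, and the choice of "slope as a percentage of the mean, per step" are assumptions rather than part of the original answer, and other normalisations (for example relative to the first value) may suit your data better.

```php
<?php
// Hypothetical helper (not from the original answer): summarises a series
// as a direction plus a relative rate of change per step.
function trendSummary(array $data): array
{
    $n = count($data);
    if ($n < 2) {
        throw new InvalidArgumentException('Need at least two data points.');
    }

    $data = array_values($data); // force 0-based keys
    $x = range(1, $n);           // same x values as getTrend(): 1, 2, 3, ...
    $xSum = array_sum($x);
    $ySum = array_sum($data);
    $xySum = 0;
    $x2Sum = 0;
    for ($i = 0; $i < $n; $i++) {
        $xySum += $x[$i] * $data[$i];
        $x2Sum += $x[$i] * $x[$i];
    }

    // Least-squares slope: the average change in the data per step.
    // The denominator cannot be zero here because x = 1..n with n >= 2.
    $slope = ($n * $xySum - $xSum * $ySum) / ($n * $x2Sum - $xSum * $xSum);
    $mean = $ySum / $n;

    // Assumption: "percentage" means the slope relative to the mean level.
    // That ratio is undefined when the mean is zero, so null is returned then.
    $percent = $mean != 0 ? abs($slope) / abs($mean) * 100 : null;

    return [
        'direction' => $slope > 0 ? 'UP' : ($slope < 0 ? 'DOWN' : 'FLAT'),
        'slope' => $slope,
        'percentPerStep' => $percent,
    ];
}

$result = trendSummary([3,5,7,6,8,9,13,14,17,15,16,16,16,18,22,20,21,20]);
echo $result['direction'], ' ', round($result['percentPerStep'], 2), '% per step', PHP_EOL;
```

A note on the design choice: the average of consecutive differences used in the question telescopes to (last - first) / (n - 1), so it depends only on the two endpoints and a single outlier at either end can flip the result. The least-squares slope uses every point, which makes it less sensitive to noise in the first or last value.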