Search code examples
phparraysmergesumgrouping

Group rows in a multidimensional array and sum the "count" elements in each group


I have an array of associative arrays and I would like to group them by the rows by the values for A, B, and C and sum the count value for each group.

$array = [
    ['A' => 'O',    'B' => 'O',     'C' => 1, 'count' => 1],
    ['A' => 'Test', 'B' => 'Test',  'C' => 1, 'count' => 1],
    ['A' => 'O',    'B' => 'O',     'C' => 1, 'count' => 1],
    ['A' => 'Test', 'B' => 'Test',  'C' => 1, 'count' => 1],
    ['A' => 'Test', 'B' => 'test1', 'C' => 2, 'count' => 1],
];

I need a result like this:

[
    ["A" => "O",    "B" => "O",     "C" => 1, "count" => 2],
    ["A" => "Test", "B" => "Test",  "C" => 1, "count" => 2],
    ["A" => "Test", "B" => "test1", "C" => 2, "count" => 1]  
]

In order to make this you need to loop through the array and check where the attributes "A", "B", "C" are equal. I tried doing this, but I couldn't fix it.

$countedArray[0] = $array[0];
foreach ($array as $item) {
    $occKey = array_filter(
        $countedArray,
        function ($countedItem, $key) use ($array) {
            if ($countedItem['A'] == $item['A']
                && $countedItem['B'] == $item['B']
                && $countedItem['C'] == $item['C']
            ) {
                $countedItem[$key]['count'] = countedItem[$key]['count'] + 1
            } else {
                array_push(
                    $countedArray,
                    [
                        'A' => $item['A'],
                        'B' => $item['B'],
                        'C' => $item['C'],
                        'count' => 1
                    ]
                );
            }
        },
        ARRAY_FILTER_USE_BOTH
    );
}

Solution

  • I've done my best to make it less verbose. I welcome any suggestions. Here's my proposed solution:

    function sumOccurrences(array $original): array
    {
        $summed = [];
        foreach ($original as $value) {
            // here we get the array without the 'count' key - everything we need to compare
            $comparisonElement = array_filter($value, function ($key) {
                return $key !== 'count';
            }, ARRAY_FILTER_USE_KEY);
            // we search with strict comparison (third param - true) - see reasoning below
            $foundAt = array_search($comparisonElement, array_column($summed, 'element'), true);
            if ($foundAt === false) {
                // we separate the values we compare and the count for easier handling
                $summed[] = ['element' => $comparisonElement, 'count' => $value['count']];
            } else {
                // if we've run into an existing element, just increase the count
                $summed[$foundAt]['count'] += $value['count'];
            }
        }
    
        // since we separated count from the values for comparison, we have to merge them now
        return array_map(function ($a) {
            // $a['count'] is wrapped in an array as it's just an integer
            return array_merge($a['element'], ['count' => $a['count']]);
        }, $summed);
    }
    

    In order to make it less verbose, I've opted to compare arrays directly. Other than being less verbose, another benefit is that this will work if additional key => value pairs are introduced to the array without any addition to logic. Everything that is not count gets compared, no matter how many pairs exist. It will also cover any nested arrays (for example 'C' => ['D' => 1]).

    But, this comes at a price - we must use strict comparison because loose can give undesired results (for example, ['a'] == [0] will return true). Strict comparison also means that it won't work if any values are objects (strict comparison means it's checking for the same instance) and that arrays will only be matched if they have the same key => value pairs in the same order. This solution assumes that your array (and any nested ones) are already sorted.

    If that is not the case, we'd have to sort it before comparing. Normally, ksort would do the job, but to support nested arrays, we'd have to devise a recursive sort by key:

    function ksortRecursive(array &$array): void
    {
        ksort($array);
        foreach ($array as &$value) {
            if (is_array($value)) {
                ksortRecursive($value);
            }
        }
    }
    

    and call it before we do array_search.

    Now if we assume a starting array like in your example, the following should give you the desired result:

    $original = [
        ['A' => 'O', 'B' => 'O', 'C' => 1, 'count' => 1],
        ['A' => 'Test', 'B' => 'Test', 'C' => 1, 'count' => 1],
        ['A' => 'O', 'B' => 'O', 'C' => 1, 'count' => 1],
        ['A' => 'Test', 'B' => 'Test', 'C' => 1, 'count' => 1],
        ['A' => 'Test', 'B' => 'test1', 'C' => 2, 'count' => 1],
    ];
    var_dump(sumOccurrences($original));