php arrays multidimensional-array sum grouping

Group 2d array data by two columns then sum a third column to create a 3d array


I need to group the data in my 2d array by one column then by another column and sum a third column in each subset.

Input data:

$orders = [
    ["date" => "20230320", "model" =>  "ABC", "quantity"  => 1],
    ["date" => "20230320", "model" =>  "DEF", "quantity"  => 2],
    ["date" => "20230320", "model" =>  "ABC", "quantity"  => 3],
    ["date" => "20230320", "model" =>  "DEF", "quantity"  => 5],
    ["date" => "20230321", "model" =>  "ABC", "quantity"  => 1],
];

Specifically, I'd like to restructure the data so that the dates become the first-level keys and, within each date, there is one subarray per unique model in which the quantity values are summed.

Desired result:

[
    20230320 => [
        ['date' => '20230320', 'model' => 'ABC', 'quantity' => 4],
        ['date' => '20230320', 'model' => 'DEF', 'quantity' => 7],
    ],
    20230321 => [
        ['date' => '20230321', 'model' => 'ABC', 'quantity' => 1],
    ],
]

After struggling for some hours, I was able to reach my goal by proceeding step by step, but I'm wondering if a simpler solution exists.

This is the complete working code:

// STEP 1 - GROUP ITEMS BY DATE
$orders2 = array();
foreach($orders as $order) {
    $date = $order['date'];
    $orders2[$date][] = $order;
} 
print_r ($orders2);

// STEP 2 - GROUP ITEMS BY MODEL AND SUM QUANTITY
$orders3 = array();
foreach($orders2 as $items) {
    foreach($items as $item) {
        $date = $item['date'];
        $model = $item['model'];
        if(empty($orders3[$date][$model])) {
            $orders3[$date][$model][] = $item;
        } else {
            $orders3[$date][$model][0]['quantity'] += $item['quantity'];    
        }
    } 
}
print_r ($orders3);

// STEP 3 - REMOVE UNNECESSARY MODEL LEVEL
$orders4 = array();
foreach($orders3 as $items) {
    foreach($items as $item) {
        $date = $item[0]['date'];
        $orders4[$date][] = $item[0];
    } 
}
print_r ($orders4);

What do you suggest? Is there a more straightforward way to obtain the same result?


Solution

  • This is a great opportunity to use reference variables.

    There are two separate groupings in this task:

    1. Group by the date and
    2. Group by the model within each date.

    This can be done by temporarily building an array keyed first by date and then by model, but the populated array then needs a second pass to replace the model keys in each subarray with indexed keys.
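For comparison, here is a rough sketch of that two-pass approach. It assumes the `$orders` data from the question; `$grouped` is a hypothetical name for the temporary array, and the second pass uses `array_values()` to discard the model keys.

```php
<?php
// Sample rows copied from the question.
$orders = [
    ["date" => "20230320", "model" => "ABC", "quantity" => 1],
    ["date" => "20230320", "model" => "DEF", "quantity" => 2],
    ["date" => "20230320", "model" => "ABC", "quantity" => 3],
    ["date" => "20230320", "model" => "DEF", "quantity" => 5],
    ["date" => "20230321", "model" => "ABC", "quantity" => 1],
];

// Pass 1: group by date, then by model, summing quantity as we go.
$grouped = []; // hypothetical name for the temporary array
foreach ($orders as $row) {
    $date = $row['date'];
    $model = $row['model'];
    if (!isset($grouped[$date][$model])) {
        $grouped[$date][$model] = $row;
    } else {
        $grouped[$date][$model]['quantity'] += $row['quantity'];
    }
}

// Pass 2: replace the model keys with 0, 1, 2... inside each date.
// array_map() keeps the date keys because only one array is passed.
$result = array_map('array_values', $grouped);

var_export($result);
```

This gives the desired structure, but at the cost of the extra iteration that the reference-based approach avoids.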

    Instead, declare references to the date-model subsets, and push them into the result.

    Code:

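    $result = [];
    $ref = []; // one entry per unique date-model pair
    foreach ($orders as $row) {
        // Composite key identifying the date-model group
        $key = $row['date'] . ' ' . $row['model'];
        if (!isset($ref[$key])) {
            // First occurrence: store the row and push a reference to it under its date
            $ref[$key] = $row;
            $result[$row['date']][] = &$ref[$key];
        } else {
            // Repeat occurrence: add to the stored quantity; $result sees the change through the reference
            $ref[$key]['quantity'] += $row['quantity'];
        }
    }
    var_export($result);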

    In the above, the date and model values are joined into a single string to form a "composite key". A row is pushed into the result array only when its date-model combination is encountered for the first time. If the composite key has already been encountered, only the current row's quantity is added to the quantity stored in the reference, and the result array reflects the new total because it holds a reference to that same row.
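To see why adding to `$ref` also changes `$result`, here is a minimal sketch of the reference behaviour in isolation. The `'demo'` key and the quantities are made up for illustration; only the `= &` assignment mirrors the snippet above.

```php
<?php
// Hypothetical one-entry lookup array, standing in for $ref in the answer.
$ref = ['demo' => ['quantity' => 1]];

// Push a reference (not a copy) to that entry into the result.
$result = [];
$result[] = &$ref['demo'];

// Modify the entry through $ref only.
$ref['demo']['quantity'] += 3;

// The result element points at the same data, so it shows the new total.
echo $result[0]['quantity']; // prints 4
```

Without the `&`, `$result[0]` would be an independent copy and would still hold the original quantity of 1.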