Tags: php, arrays, filtering, average, array-intersect

Average each associative pair found in a 2d array


Consider this collection below:

$collection = [
    [1 => 10.0, 2 => 20.0, 3 => 50.0, 4 => 80.0, 5 => 100.0],
    [3 => 20.0, 5 => 20.0, 6 => 100.0, 7 => 10.0],
    [1 => 30.0, 3 => 30.0, 5 => 10.0, 8 => 10.0]
];

Consider this theoretical output: the intersection, by key, of the arrays contained in $collection, where each shared key maps to the average of its values across those arrays:

$output = Array ( 3 => 33.3333, 5 => 43.3333 );

Can this problem be resolved with a native PHP function like array_intersect_* in an elegant way?

If not, can you suggest an elegant solution that doesn't necessarily need an ugly outer foreach?

Keep in mind that the number of arrays to be intersected is not fixed: it may be 2 input arrays or 1000. Keys will always be integers, and values will always be floats or integers.

In other words:

$collection = [
    $arr1 = [ ... ],
    $arr2 = [ ... ],
    $arr3 = [ ... ],
    ...
    $arrn = [ ... ],
];
$output = [ array intersected by key across $arr1 to $arrn, where each value is the average of that key's values ];

Solution

  • Can this problem be resolved with a native PHP function like array_intersect_* in an elegant way?

    Well, elegance is in the eye of the developer. If functional-style programming with no new globally-scoped variables equals elegance, then I have something tasty for you. Can a native array_intersect_*() call be leveraged in this task? You bet!

    There's a big lack in PHP native functions on intersects - @Maurizio

    I disagree. PHP has a broad suite of powerful, optimized, native array_intersect*() and array_diff*() functions. I believe that too few developers are well-acquainted with them all. I've even built a comprehensive demonstration of the different array_diff*() functions (which can be easily inverted to array_intersect*() for educational purposes).
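
As a small illustration of how the two families mirror each other, here is a minimal sketch using the first two rows of the question's data. The variable names $first, $second, $kept, and $dropped are chosen here for illustration and are not part of the original task.

```php
<?php
// First two rows from the question's $collection.
$first  = [1 => 10.0, 2 => 20.0, 3 => 50.0, 4 => 80.0, 5 => 100.0];
$second = [3 => 20.0, 5 => 20.0, 6 => 100.0, 7 => 10.0];

// Elements of $first whose keys also exist in $second.
$kept = array_intersect_key($first, $second);

// Elements of $first whose keys do NOT exist in $second.
$dropped = array_diff_key($first, $second);

var_export($kept);
var_export($dropped);
```

Both calls return elements of the first array only, with keys preserved, so the two results together account for every element of $first.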


    Now, onto your task. First, the code, then the explanation.

    Code: (Demo)

    var_export(
        array_reduce(
            array_keys(
                array_intersect_ukey(
                    ...array_merge($collection, [fn($a, $b) => $a <=> $b])
                )
            ),
            fn($result, $k) => $result + [$k => array_sum(array_column($collection, $k)) / count($collection)],
            []
        )
    );
    
    1. The first subtask is to isolate the keys which are present in every row. array_intersect_ukey() is very likely the best qualified tool. The easy part is the custom function: just write the two parameters with the spaceship operator in between. The hard part is setting up the variable number of leading input parameters followed by the closure, because PHP does not allow a positional argument after a spread argument. For this, temporarily merge the closure as an array element onto the collection variable, then spread the merged array into the native function.
    2. The payload produced by #1 is an array consisting of the associative elements from the first row where the keys were represented in all rows ([3 => 50.0, 5 => 100.0]). To prepare the data for the next step, the keys must be converted to values. array_keys() is ideal because the float values are of no further use.
    3. Although there is an equal number of elements going into and returning from the final "averaging step", the final result must be a flat associative array, so array_map() will not suffice. Instead, array_reduce() is better suited. With the collection variable accessible thanks to PHP 7.4's arrow function syntax, array_column() can isolate the full column of data for each key, and the resulting average is then added as an associative element to the result array.
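
The three steps above can also be written out with intermediate variables, which makes each payload easy to inspect. This is only a sketch of the same approach using the question's sample data; the names $shared, $keys, and $averages are introduced here for illustration, and PHP 7.4 or later is assumed for the arrow functions.

```php
<?php
// Sample data copied from the question.
$collection = [
    [1 => 10.0, 2 => 20.0, 3 => 50.0, 4 => 80.0, 5 => 100.0],
    [3 => 20.0, 5 => 20.0, 6 => 100.0, 7 => 10.0],
    [1 => 30.0, 3 => 30.0, 5 => 10.0, 8 => 10.0],
];

// Step 1: keep the first row's elements whose keys occur in every row.
// The comparison closure is merged in as the last element so that the
// whole argument list can be spread in a single unpacking.
$shared = array_intersect_ukey(
    ...array_merge($collection, [fn($a, $b) => $a <=> $b])
);

// Step 2: only the keys matter from here on.
$keys = array_keys($shared);

// Step 3: build a flat associative array of key => average.
// array_column() gathers the values stored under $k in every row.
$averages = array_reduce(
    $keys,
    fn($result, $k) => $result + [$k => array_sum(array_column($collection, $k)) / count($collection)],
    []
);

var_export($averages);
```

With this data, $shared holds [3 => 50.0, 5 => 100.0], $keys holds [3, 5], and the averages come out at roughly 33.33 for key 3 and 43.33 for key 5, matching the output requested in the question.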