Tags: php, arrays, multidimensional-array, filtering, blacklist

How to filter multidimensional array based on another multidimensional array of exclusions?


I'm attempting to filter a multidimensional array containing more than 40,000 products. Each product entry (subarray) contains the product ID and some product attributes.

I also have an associative exclusion array that maps specific attribute names to one or more blacklisted values.

If a product has any key-value pairs that are specified in my exclusion array, then that product/subarray should be filtered out.

Exclusions array:

$exclusions = [
    'Discontinue Status' => [
        'Discontinued',
        'Run Down Stock',
    ],
    'Hazardous' => [
        'No',
    ],
];

Sample products array:

$products = [
    [
        'Product ID' => '452',
        'Discontinue Status' => 'Discontinued',
        'Hazardous' => 'No',
    ],
    [
        'Product ID' => '463',
        'Discontinue Status' => 'Normal',
        'Hazardous' => 'No',
    ],
    [
        'Product ID' => '477',
        'Discontinue Status' => 'Run Down Stock',
        'Hazardous' => 'Yes',
    ],
    [
        'Product ID' => '502',
        'Discontinue Status' => 'Discontinued',
        'Hazardous' => 'No',
    ],
    [
        'Product ID' => '520',
        'Discontinue Status' => 'Normal',
        'Hazardous' => 'Yes',
    ],
];

Expected output:

[
    [
        'Product ID' => '520',
        'Discontinue Status' => 'Normal',
        'Hazardous' => 'Yes',
    ],
]

With the following code, I only got as far as returning the correct number of items. Each returned item contains only the attributes that were checked against the exclusions, not the whole product.

    $exclusions = $this->exclusions;

    $products = [];

    foreach ($array as $product) {

        $filtered = array_filter($product, function ($val, $key) use ($exclusions) { 
                return isset($exclusions[$key]) && !in_array($val, $exclusions[$key]);
            },
            ARRAY_FILTER_USE_BOTH
        ); 

        $products[] = $filtered;

    }  

    $result = array_filter(array_map('array_filter', $products));

    echo '<pre>' . var_export($result, true) . '</pre>';
    echo count($result);

Solution

  • Breaking your exclusion array into separate variables is not advisable because this will make your code harder / more tedious to maintain when you want to modify the exclusion list.

    Iterate your products array just one time.  Inside that loop, loop over only those attributes of each product whose keys match the first-level keys in the exclusion array. This is what array_intersect_key() does best, and it avoids an unnecessary comparison on the Product ID elements.  As soon as a disqualifying condition is satisfied, stop the inner loop for best efficiency.
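
    To see what array_intersect_key() contributes here, the following minimal sketch applies it to one product from the sample data. The variable name $checkable is only illustrative and is not part of the original code.

    ```php
    <?php
    $exclusions = [
        'Discontinue Status' => ['Discontinued', 'Run Down Stock'],
        'Hazardous' => ['No'],
    ];

    $product = [
        'Product ID' => '477',
        'Discontinue Status' => 'Run Down Stock',
        'Hazardous' => 'Yes',
    ];

    // Keep only the attributes whose keys also exist in $exclusions.
    // 'Product ID' is dropped; the two keys shared with $exclusions remain.
    $checkable = array_intersect_key($product, $exclusions);

    var_export($checkable);
    ```

    Only the attributes named in $exclusions survive, so the inner loop never has to look at Product ID.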

    Code #1: (Demo) *my recommendation

    $result = [];
    foreach ($products as $product) {
        foreach (array_intersect_key($product, $exclusions) as $key => $value) {
            if (in_array($value, $exclusions[$key])) {
                continue 2;
            }
        }
        $result[] = $product;
    }
    var_export($result);
    

    Code #2: (Demo)

    foreach ($products as $index => $product) {
        foreach (array_intersect_key($product, $exclusions) as $key => $value) {
            if (in_array($value, $exclusions[$key])) {
                unset($products[$index]);
                break;
            }
        }
    }
    var_export(array_values($products));
    

    Code #3: (Demo)

    var_export(
        array_values(
            array_filter(
                $products,
                function($product) use ($exclusions) {
                    return !array_filter(
                        array_intersect_key($product, $exclusions),
                        function($value, $key) use ($exclusions) {
                            return in_array($value, $exclusions[$key]);
                        },
                        ARRAY_FILTER_USE_BOTH
                    );
                }
            )
        )
    );
    

    Code #4: (Demo)

    var_export(
        array_values(
            array_filter(
                $products,
                fn($product) => !array_filter(
                    array_intersect_key($product, $exclusions),
                    fn($value, $key) => in_array($value, $exclusions[$key]),
                    ARRAY_FILTER_USE_BOTH
                )
            )
        )
    );
    

    Code #1 uses continue 2; to halt the inner loop, skip storing the current product in the output array, and return to the outer loop to carry on with the next product.
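
    As a rough sketch, Code #1 can be wrapped in a function and run against the sample data from the question. The function name filterProducts() is hypothetical, and the strict third argument to in_array() is my own addition; it assumes attribute values and blacklisted values are both strings.

    ```php
    <?php
    // Hypothetical wrapper around Code #1; not part of the original answer.
    function filterProducts(array $products, array $exclusions): array
    {
        $result = [];
        foreach ($products as $product) {
            foreach (array_intersect_key($product, $exclusions) as $key => $value) {
                // Strict comparison is an assumption: all values are strings.
                if (in_array($value, $exclusions[$key], true)) {
                    continue 2; // leave the inner loop and skip this product
                }
            }
            $result[] = $product; // reached only if no exclusion matched
        }
        return $result;
    }

    $exclusions = [
        'Discontinue Status' => ['Discontinued', 'Run Down Stock'],
        'Hazardous' => ['No'],
    ];

    $products = [
        ['Product ID' => '452', 'Discontinue Status' => 'Discontinued', 'Hazardous' => 'No'],
        ['Product ID' => '463', 'Discontinue Status' => 'Normal', 'Hazardous' => 'No'],
        ['Product ID' => '477', 'Discontinue Status' => 'Run Down Stock', 'Hazardous' => 'Yes'],
        ['Product ID' => '502', 'Discontinue Status' => 'Discontinued', 'Hazardous' => 'No'],
        ['Product ID' => '520', 'Discontinue Status' => 'Normal', 'Hazardous' => 'Yes'],
    ];

    $kept = filterProducts($products, $exclusions);
    var_export(array_column($kept, 'Product ID')); // only '520' remains
    ```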

    Code #2 modifies the $products array directly.  Because products are unset, the array may cease to be a sequentially indexed array (its integer keys may have gaps). If that matters, call array_values() after the loop to re-index the output.
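
    The gap in the keys is easy to reproduce. This small sketch uses a trimmed-down products array (an assumption for brevity) and the illustrative variable $reindexed.

    ```php
    <?php
    $products = [
        ['Product ID' => '452'],
        ['Product ID' => '463'],
        ['Product ID' => '520'],
    ];

    // Removing the middle element leaves keys 0 and 2; key 1 is simply gone.
    unset($products[1]);
    var_export(array_keys($products));

    // array_values() rebuilds the array with consecutive keys 0 and 1.
    $reindexed = array_values($products);
    var_export(array_keys($reindexed));
    ```

    Re-indexing matters mostly when later code relies on consecutive keys, for example a for loop using numeric offsets.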

    Code #3 uses a functional style.  Language constructs (e.g. foreach) commonly outperform functional iterators, so I assume this snippet (and Code #4) will be slightly slower than the first two.  Furthermore, array_filter() cannot return early the way the foreach loops can with break and continue.  In other words, Code #3 & #4 keep checking against the exclusion array even after a disqualifying condition has been found for a given product. And if that weren't enough, I simply find the syntax too convoluted.
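
    The missing early exit can be made visible by counting checks. The sketch below is only an illustration, not a benchmark: the counters $loopChecks and $filterChecks are hypothetical names, and the strict in_array() flag is my own assumption that all values are strings.

    ```php
    <?php
    $exclusions = [
        'Discontinue Status' => ['Discontinued', 'Run Down Stock'],
        'Hazardous' => ['No'],
    ];

    // This product is already disqualified by its first checked attribute.
    $product = [
        'Product ID' => '452',
        'Discontinue Status' => 'Discontinued',
        'Hazardous' => 'No',
    ];

    $candidates = array_intersect_key($product, $exclusions);

    // foreach can stop at the first match.
    $loopChecks = 0;
    foreach ($candidates as $key => $value) {
        ++$loopChecks;
        if (in_array($value, $exclusions[$key], true)) {
            break;
        }
    }

    // array_filter() always visits every element.
    $filterChecks = 0;
    array_filter(
        $candidates,
        function ($value, $key) use ($exclusions, &$filterChecks) {
            ++$filterChecks;
            return in_array($value, $exclusions[$key], true);
        },
        ARRAY_FILTER_USE_BOTH
    );

    printf("foreach: %d check(s), array_filter: %d check(s)\n", $loopChecks, $filterChecks);
    ```

    With only two excluded attributes the difference is a single check per product, so the gap grows mainly with the number of attributes in the exclusion array.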

    Code #4 is the same as Code #3, but uses the slightly shorter "arrow function" syntax, which is available from PHP 7.4 and up.  Arrow functions capture outer variables automatically, so use ($exclusions) and a few other characters can be omitted, but I still find the snippet less intuitive/readable than Code #1 & #2.
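
    For a side-by-side look at the two syntaxes, here is a deliberately simplified sketch that checks a single attribute only, so it is not a replacement for Code #3 or #4. The names $isAllowedClosure and $isAllowedArrow are hypothetical, and PHP 7.4 or newer is assumed.

    ```php
    <?php
    $exclusions = ['Hazardous' => ['No']];

    $products = [
        ['Product ID' => '463', 'Hazardous' => 'No'],
        ['Product ID' => '520', 'Hazardous' => 'Yes'],
    ];

    // Anonymous function: outer variables must be imported explicitly with use.
    $isAllowedClosure = function ($product) use ($exclusions) {
        return !in_array($product['Hazardous'], $exclusions['Hazardous']);
    };

    // Arrow function: $exclusions is captured by value automatically.
    $isAllowedArrow = fn($product) => !in_array($product['Hazardous'], $exclusions['Hazardous']);

    $viaClosure = array_values(array_filter($products, $isAllowedClosure));
    $viaArrow = array_values(array_filter($products, $isAllowedArrow));

    var_export($viaClosure === $viaArrow); // true
    ```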