php arrays duplicates filtering partial-matches

Remove elements from a flat array when one of the words in their string already appeared in a previously encountered element


I want to filter an array like this:

$str = array(
    "Lincoln Crown",
    "Crown Court",
    "go holiday",
    "house fire",
    "John Hinton",
    "Hinton Jailed"
);

In this array, "Lincoln Crown" contains the words Lincoln and Crown, so any later element that contains either of these words should be removed, such as "Crown Court" (contains Crown).

In another case, John Hinton contains John and Hinton, so Hinton Jailed should be removed because it contains Hinton.

The final output should be like this:

array(
    "Lincoln Crown",
    "go holiday",
    "house fire",
    "John Hinton"
);

I have unsuccessfully tried using array_unique() and array_diff().


Solution

  • I think this might work :P

    function cool_function($strs){
        // Black list
        $toExclude = array();
    
        foreach($strs as $s){
            // If it's not on blacklist, then search for it
            if(!in_array($s, $toExclude)){
                // Explode into blocks
                foreach(explode(" ",$s) as $block){
                    // Find elements containing the block as a whole word (not as a substring)
                    $found = preg_grep("/(?<!\\S)" . preg_quote($block, "/") . "(?!\\S)/", $strs);
                    foreach($found as $k => $f){
                        if($f != $s){
                            // Place each found item that's different from current item into blacklist
                            $toExclude[$k] = $f;
                        }
                    }
                }
            }
        }
    
        // Unset all keys that were found
        foreach($toExclude as $k => $v){
            unset($strs[$k]);
        }
    
        // Return the result
        return $strs;
    }
    
    $strs = array("Lincoln Crown","Crown Court","go holiday","house fire","John Hinton","Hinton Jailed");
    print_r(cool_function($strs));
    

    Dump:

    Array
    (
        [0] => Lincoln Crown
        [2] => go holiday
        [3] => house fire
        [4] => John Hinton
    )
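
A single-pass alternative, as a rough sketch: keep a set of the words seen in the elements kept so far, and drop any element that shares a whole word with that set. The function name filter_by_seen_words is made up for illustration. The sketch assumes words are separated by whitespace, matching is case-sensitive, and words from removed elements are not recorded (which mirrors how the function above skips blacklisted items).

```php
<?php
// Hypothetical helper, not part of the original answer.
function filter_by_seen_words(array $strs): array
{
    $seen = [];   // words from kept elements, stored as keys for fast lookup
    $kept = [];   // surviving elements, with their original keys

    foreach ($strs as $key => $s) {
        // Split on runs of whitespace and drop empty pieces
        $words = preg_split('/\s+/', $s, -1, PREG_SPLIT_NO_EMPTY);

        // Skip this element if any of its words was seen in a kept element
        foreach ($words as $w) {
            if (isset($seen[$w])) {
                continue 2;
            }
        }

        // Keep the element and record its words for later comparisons
        $kept[$key] = $s;
        foreach ($words as $w) {
            $seen[$w] = true;
        }
    }

    return $kept;
}

$strs = ["Lincoln Crown", "Crown Court", "go holiday", "house fire", "John Hinton", "Hinton Jailed"];
print_r(filter_by_seen_words($strs));
```

For the sample array this should keep the same four elements, with the same keys, as the dump above. Wrap the call in array_values() if sequential keys are wanted.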