
PHP preg_match: check each word in a string against an array of forbidden words


I have a list of forbidden words, and I have to check whether any of them appears in a given string. My current code only partially works.

A match should be true if and only if:

  1. any of the words in the string is an exact match with any of the forbidden words, e.g.: the pool is cold.
  2. any of the words in the string starts with any of the forbidden words, e.g.: the poolside is yellow.

A match should be false otherwise. That includes both of these cases, which currently return true by mistake:

  1. if any of the words in the string ends with any of the forbidden words, e.g.: the carpool lane is closed.
  2. if any of the words in the string contains any of the forbidden words, e.g.: the print spooler is not working.

Current code:

$forbidden = array('pool', 'cat', 'rain');

// example: no matching words at all
$string = 'hello and goodbye'; //should be FALSE - working fine

// example: pool
$string = 'the pool is cold'; //should be TRUE - working fine
$string = 'the poolside is yellow'; //should be TRUE - working fine
$string = 'the carpool lane is closed'; //should be FALSE - currently failing
$string = 'the print spooler is not working'; //should be FALSE - currently failing

// example: cat
$string = 'the cats are wasting my time'; //should be TRUE - working fine
$string = 'the cat is wasting my time'; //should be TRUE - working fine
$string = 'joe is using the bobcat right now'; //should be FALSE - currently failing

// match finder
if(preg_match('('.implode('|', $forbidden).')', $string)) {
    echo 'match!';
} else {
    echo 'no match...';
}

Relevant optimization note: the official $forbidden words array has over 350 items, and the average given $string will have around 25 words. So, it would be great if the solution stops the preg_match process as soon as it finds the first occurrence.


Solution

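  • The key is to put the \b word-boundary assertion in front of the alternation, so a forbidden word matches only at the start of a word. preg_match() already stops at the first match, which covers the optimization concern: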

    <?php
    $forbidden = ['pool', 'cat', 'rain'];
    
    // Examples, each with its expected result
    $examples = [
        // pool:
        'the pool is cold', // TRUE: exact word
        'the poolside is yellow', // TRUE: word starts with "pool"
        'the carpool lane is closed', // FALSE: word only ends with "pool"
        'the print spooler is not working', // FALSE: "pool" is mid-word
    
        // cat:
        'the cats are wasting my time', // TRUE: word starts with "cat"
        'the cat is wasting my time', // TRUE: exact word
        'joe is using the bobcat right now', // FALSE: word only ends with "cat"
    ];
    
    // Escape each word so regex metacharacters in the list are taken literally
    $quoted = array_map(function ($word) { return preg_quote($word, '/'); }, $forbidden);
    
    // \b requires the match to begin at the start of a word; /i ignores case
    $pattern = '/\b(' . implode('|', $quoted) . ')/i';
    
    foreach ($examples as $example) {
        echo (preg_match($pattern, $example) ? 'TRUE' : 'FALSE') . ': ' . $example . "\n";
    }
    

    http://sandbox.onlinephpfunctions.com/code/f424e6c78d3b13905486f646667c8bc9d48eda3a
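
With a list of 350+ words checked against many strings, it may help to build the pattern once and reuse it. The following is a minimal sketch of that idea, not part of the original answer. The helper names buildForbiddenPattern and containsForbidden are hypothetical. It assumes the forbidden words are plain ASCII and escapes them with preg_quote in case any contain regex metacharacters.

```php
<?php
// Hypothetical helper: builds the regex once so it can be reused for many strings.
function buildForbiddenPattern(array $forbidden): string
{
    // preg_quote escapes regex metacharacters; '/' is the delimiter used below.
    $escaped = array_map(
        function (string $word): string {
            return preg_quote($word, '/');
        },
        $forbidden
    );

    // \b anchors the alternation at the start of a word. There is no trailing
    // \b, so "poolside" still matches while "carpool" does not.
    return '/\b(?:' . implode('|', $escaped) . ')/i';
}

// Hypothetical helper: true when any word in $string starts with a forbidden word.
function containsForbidden(string $pattern, string $string): bool
{
    // preg_match returns 1 as soon as it finds the first match.
    return preg_match($pattern, $string) === 1;
}

$pattern = buildForbiddenPattern(['pool', 'cat', 'rain']);

var_dump(containsForbidden($pattern, 'the poolside is yellow'));     // bool(true)
var_dump(containsForbidden($pattern, 'the carpool lane is closed')); // bool(false)
```

Note that \b treats letters, digits and underscores as word characters, so "car_pool" would not count as a match. An empty word list would produce a pattern that matches at any word start, so guard against that case if the list can be empty.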