Tags: php, regex, keyword-spotting

Find names in string using regex without including first names if second name is present


I have a string and a list of names that I compare the string against using preg_match_all, which returns the matches. However, some entries in the list are a first name or a last name only, while others are full names. See my example below.

$names = 'jon|jon snow|lana|smith|lana smith|megan';
$string = 'Jon Snow and Lana Smith met up with Lana and Megan.';
preg_match_all("~\b($names)\b~i", $string, $matches);

The above example with my current expression returns all the names, which isn't what I want.

What I want returned: jon snow, lana smith, lana, megan.

What I don't want returned: jon, smith


Solution

  • It seems you're looking for negative lookaround assertions.

    For example, jon(?! snow) matches "jon", but only if " snow" does not follow. Likewise, (?<!lana )smith matches "smith", but only if "lana " does not precede it.

    $names = 'jon(?! snow)|jon snow|lana(?! smith)|(?<!lana )smith|lana smith|megan';
    

    Test it live on regex101.com.

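    Another possibility, less explicit but with comparable results, is to ensure that the "composite" terms are tested first. This works because the regex engine tries the alternatives from left to right and takes the first one that matches at a given position: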

    $names = 'jon snow|jon|lana smith|lana|smith|megan';
    

    Test it live on regex101.com.
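
To compare the two variants side by side, here is a minimal sketch that runs both patterns against the string from the question. The helper buildNamePattern() is hypothetical and not part of the answer above: it assumes you start from a plain array of names and want the "composite terms first" ordering generated for you, by sorting longest first and escaping each name with preg_quote. It uses arrow functions, so it assumes PHP 7.4 or later.

```php
<?php
// Sample input from the question.
$string = 'Jon Snow and Lana Smith met up with Lana and Megan.';

// Variant 1: lookarounds reject a lone name when the full name is present.
$lookaround = 'jon(?! snow)|jon snow|lana(?! smith)|(?<!lana )smith|lana smith|megan';

// Variant 2: composite names are listed before their parts.
$ordered = 'jon snow|jon|lana smith|lana|smith|megan';

// Hypothetical helper (an assumption, not from the answer): builds a
// variant-2 style pattern from an unordered list of names.
function buildNamePattern(array $names): string
{
    // Longest names first, so "jon snow" is tried before "jon".
    usort($names, fn($a, $b) => strlen($b) <=> strlen($a));
    // Escape regex metacharacters and the "~" delimiter in each name.
    $quoted = array_map(fn($n) => preg_quote($n, '~'), $names);
    return '~\b(?:' . implode('|', $quoted) . ')\b~i';
}

preg_match_all("~\b(?:$lookaround)\b~i", $string, $m1);
preg_match_all("~\b(?:$ordered)\b~i", $string, $m2);
preg_match_all(
    buildNamePattern(['jon', 'jon snow', 'lana', 'smith', 'lana smith', 'megan']),
    $string,
    $m3
);

// For this input, all three calls find: Jon Snow, Lana Smith, Lana, Megan
print_r($m1[0]);
print_r($m2[0]);
print_r($m3[0]);
```

Sorting by length is enough here because a shorter name can only shadow a longer one that starts with it, and the longer one is then always tried first. The lookaround variant stays correct regardless of the order of the alternatives, at the cost of spelling out every first-name and last-name pair by hand.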