I have a string and a list of names that I compare the string against using preg_match_all, which returns the matches. However, some names in the list are a first name or a last name only, while others are both. See my example below.
$names = 'jon|jon snow|lana|smith|lana smith|megan';
$string = 'Jon Snow and Lana Smith met up with Lana and Megan.';
preg_match_all("~\b($names)\b~i", $string, $matches);
With my current expression, the example above returns all the names, which isn't what I want.
What I want returned: jon snow, lana smith, lana, megan.
What I don't want returned: jon, smith
It seems you're looking for negative lookaround assertions.
For example, jon(?! snow) matches "jon", but only if " snow" does not follow.
$names = 'jon(?! snow)|jon snow|lana(?! smith)|(?<!lana )smith|lana smith|megan';
Test it live on regex101.com.
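As a quick check, here is a minimal sketch that plugs the lookaround version of $names into the asker's original call. It reuses the variable names and test string from the question; nothing else is assumed.

```php
<?php
// Lookaround version: "jon" and "lana" are rejected when the listed surname
// follows, and "smith" is rejected when "lana " precedes it.
$names  = 'jon(?! snow)|jon snow|lana(?! smith)|(?<!lana )smith|lana smith|megan';
$string = 'Jon Snow and Lana Smith met up with Lana and Megan.';

// Same call as in the question; the "i" flag also applies inside the lookarounds.
preg_match_all("~\b($names)\b~i", $string, $matches);

// $matches[0] holds the full matches in order of appearance.
print_r($matches[0]);
```

For the sample string this should give Jon Snow, Lana Smith, Lana and Megan, with no standalone "Jon" or "Smith".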
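Another possibility, less explicit but with comparable results, is to ensure that the "composite" terms are tested first. The regex engine tries alternatives from left to right and stops at the first one that matches, so "jon snow" has to come before "jon":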
$names = 'jon snow|jon|lana smith|lana|smith|megan';
Test it live on regex101.com.
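If the list of names is not written by hand, the ordering can be produced automatically. The sketch below is only one way to do it: buildNamesPattern is a hypothetical helper (not part of the original answer) that sorts the names longest first, so any composite name is tried before the shorter names it contains. It assumes PHP 7.4 or later for the arrow functions.

```php
<?php
// Hypothetical helper: build the alternation from an array of names,
// longest name first, so composite names are tried before their parts.
function buildNamesPattern(array $names): string
{
    // Sort by length, longest first.
    usort($names, fn(string $a, string $b): int => strlen($b) <=> strlen($a));

    // Escape regex metacharacters and the "~" delimiter in each name.
    $escaped = array_map(fn(string $n): string => preg_quote($n, '~'), $names);

    return '~\b(' . implode('|', $escaped) . ')\b~i';
}

$string  = 'Jon Snow and Lana Smith met up with Lana and Megan.';
$pattern = buildNamesPattern(['jon', 'jon snow', 'lana', 'smith', 'lana smith', 'megan']);

preg_match_all($pattern, $string, $matches);
print_r($matches[0]);
```

One difference from the lookaround version is worth knowing: for input such as "Jon Snowden", this pattern falls back to matching "Jon", whereas jon(?! snow) would not match there at all.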