I am trying to match email addresses, but only when they are not preceded by "mailto:". I tried this regular expression:
"/(?<!mailto:)[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})/"
against this string:
'<a href="mailto:[email protected]">EMAIL</a> ... [email protected] '
I would expect to catch only '[email protected]', but I also receive '[email protected]' (note the missing 's'). I wonder what's wrong here. Can't I have a normal regex after the lookbehind assertion?
My whole example in PHP looks like:
$testString = '<a href="mailto:[email protected]">EMAIL</a> ... [email protected] ';
$pattern = "/(?<!mailto:)[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})/";
preg_match_all($pattern, $testString, $matches);
echo('<pre>');print_r($matches);echo('</pre>');
Thank you!
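The regex engine retries the match at every position in the string. Starting just after the s, the remaining text, [email protected], still matches your pattern, and because the text immediately before that position ends in s rather than mailto:, the negative lookbehind passes and you get a match. Requiring a word boundary right after the lookbehind will work for most cases: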
Change:
(?<!mailto:)
To:
(?<!mailto:)\b
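To see the difference, here is a minimal sketch that runs both patterns side by side. The addresses support@example.com and info@example.com are hypothetical stand-ins I picked for illustration (the originals are obscured in the question), and the variable names are mine:

```php
<?php
// Hypothetical test string; both addresses are made up for illustration.
$testString = '<a href="mailto:support@example.com">EMAIL</a> ... info@example.com ';

// The rest of the pattern is unchanged from the question.
$emailPart = '[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})';

// Original: lookbehind only. One character into the address, the preceding
// text is no longer "mailto:", so the lookbehind passes there.
$withoutBoundary = '/(?<!mailto:)' . $emailPart . '/';

// Fixed: the match must additionally start at a word boundary, so it
// cannot begin in the middle of "support".
$withBoundary = '/(?<!mailto:)\b' . $emailPart . '/';

preg_match_all($withoutBoundary, $testString, $before);
preg_match_all($withBoundary, $testString, $after);

print_r($before[0]); // upport@example.com, info@example.com
print_r($after[0]);  // info@example.com
```

Note that `\b` only looks at word characters (letters, digits, underscore), so an address that starts with a hyphen could still slip through. That is why this works for most cases rather than all.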
On a side note: use example.com for examples; domain.com is owned by an actual company.