Regex to match email addresses, and common obfuscations


I'm wondering if anyone has a good regex to match email addresses, plus the common ways to obfuscate them, e.g. "joe [at] foo [dot] com". I'm not looking for a super regex that's completely RFC compliant. For example, the following is mostly good enough:

^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$

I just need to tweak it for the most common ways to obfuscate email addresses. Yes, I know some people will outsmart it and find a way to obfuscate their email addresses in ways that the regex won't match, but I'm not worried about those situations.

Edit: Please read the whole question. I'm not asking about validating email addresses. I know there are thousands of posts on the web about that. I'm specifically looking for a way to detect obfuscated email addresses.


Solution

  • How about something along the lines of this:

     ^[A-Z0-9._%+-]+(@|\s*\[\s*at\s*\]\s*)[A-Z0-9.-]+(\.|\s*\[\s*dot\s*\]\s*)[A-Z]{2,6}$
    

    Here's an example of it at work: http://regexr.com?2uh92

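    In short, it puts a group of alternatives at the @ and . delimiters, so each one matches either the literal character or the bracketed word ([at], [dot]) with optional whitespace around it. To accept parentheses as well, replace each \[ with (\[|\() and each \] with (\]|\)), which would match something like hi_there (at) gmail (dot) com. Like the pattern in the question, it assumes case-insensitive matching.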
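As a rough illustration of the idea above, here is a minimal Python sketch that combines the bracket and parenthesis variants in one pattern. The names AT, DOT, OBFUSCATED_EMAIL and looks_like_email are made up for this example, and the use of re.IGNORECASE is an assumption based on the patterns only listing A-Z.

```python
import re

# Either a literal "@" or the word "at" wrapped in [] or (),
# with optional whitespace around and inside the delimiters.
AT = r"(?:@|\s*[\[(]\s*at\s*[\])]\s*)"

# Either a literal "." or the word "dot" wrapped in [] or ().
DOT = r"(?:\.|\s*[\[(]\s*dot\s*[\])]\s*)"

# Same overall shape as the pattern in the answer; the doubled braces
# produce a literal {2,6} quantifier inside the f-string.
OBFUSCATED_EMAIL = re.compile(
    rf"^[A-Z0-9._%+-]+{AT}[A-Z0-9.-]+{DOT}[A-Z]{{2,6}}$",
    re.IGNORECASE,  # assumption: the A-Z ranges are meant case-insensitively
)


def looks_like_email(text: str) -> bool:
    """Return True if text is a plain or lightly obfuscated address."""
    return OBFUSCATED_EMAIL.match(text) is not None


if __name__ == "__main__":
    for sample in (
        "joe@foo.com",
        "joe [at] foo [dot] com",
        "hi_there (at) gmail (dot) com",
        "joe [at] foo",
    ):
        print(sample, "->", looks_like_email(sample))
```

Because the opening and closing delimiters are checked independently, a mismatched pair such as "[at)" is also accepted. That seems acceptable for loose detection, but it is a design choice rather than a requirement.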