Tags: pattern-matching, logstash-grok, grok

Grok pattern to match email address


I have the following Grok patterns defined in a pattern file:

HOSTNAME \b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b)
EMAILLOCALPART [a-zA-Z][a-zA-Z0-9_.+-=:]+
EMAILADDRESS %{EMAILLOCALPART}@%{HOSTNAME}

For some reason this doesn't compile when run against http://grokdebug.herokuapp.com/ with the following input; it simply returns "Compile error":

Node1\Spam.log.2016-05-03   171 1540699703 03/May/2016 00:00:01 +0000  INFO  [http-bio-0.0.0.0-8001-exec-20429] EngagementServiceImpl logDefault 192.168.1.122 77777777777777777 DAMIEN@DAMIEN.COM > initiated Stuff: 8675309, provider: 8675309, member: 8675309

Is there some reason I'm getting a compile error, and will this even match the email in that log line?

Thanks,


Solution

  • You may use

    (?<email>[a-zA-Z0-9_.+=:-]+@[0-9A-Za-z][0-9A-Za-z-]{0,62}(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*)
    

    or:

    (?<email>[\w.+=:-]+@[0-9A-Za-z][0-9A-Za-z-]{0,62}(?:[.](?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*)
    

    Both work at grokdebug.herokuapp.com. As an aside, https://github.com/rgevaert/grok-patterns/blob/master/grok.d/postfix_patterns defines the email pattern differently, as EMAILADDRESS %{EMAILADDRESSPART:local}@%{EMAILADDRESSPART:remote}, which may also work.
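
To sanity-check the first pattern outside the Grok debugger, here is a minimal Python sketch. It assumes Python's `re` module as a stand-in for the regex engine Grok uses, so the named group is written `(?P<email>...)` instead of `(?<email>...)`; the names `LOG_LINE`, `EMAIL`, `range_class`, and `literal_class` are illustrative only. It also shows one difference from the question's EMAILLOCALPART: inside a character class, `+-=` is read as a range from `+` to `=`, whereas a hyphen placed last in the class is a literal. Whether that range is what triggers the "Compile error" is not established here.

```python
import re

# Log line from the question.
LOG_LINE = (
    r"Node1\Spam.log.2016-05-03   171 1540699703 03/May/2016 00:00:01 +0000  INFO  "
    "[http-bio-0.0.0.0-8001-exec-20429] EngagementServiceImpl logDefault "
    "192.168.1.122 77777777777777777 DAMIEN@DAMIEN.COM > initiated Stuff: 8675309, "
    "provider: 8675309, member: 8675309"
)

# First pattern from the answer, with the named group in Python syntax.
EMAIL = re.compile(
    r"(?P<email>[a-zA-Z0-9_.+=:-]+@[0-9A-Za-z][0-9A-Za-z-]{0,62}"
    r"(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*)"
)

match = EMAIL.search(LOG_LINE)
print(match.group("email") if match else "no match")  # prints DAMIEN@DAMIEN.COM

# Hyphen placement inside a character class changes its meaning.
range_class = re.compile(r"[+-=]")    # range '+'..'=' (also covers ',', '/', digits, ';', '<')
literal_class = re.compile(r"[+=-]")  # only the literals '+', '=', '-'

print(bool(range_class.fullmatch(";")))    # True: ';' falls inside the range
print(bool(literal_class.fullmatch(";")))  # False: ';' is not one of the literals
```

Keeping the hyphen at the end of the class, as both patterns in the answer do, avoids the accidental range and keeps the set of accepted characters explicit.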