Tags: regex, email, grep, email-validation, regex-greedy

Why does grep match my regex lazily?


I am trying to write a simple e-mail regex and extract the e-mail address itself with grep (on Kali Linux, if that matters). This is (roughly) my code:

email_regex='([a-zA-Z0-9_.+-]+@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-])+)'
egrep -o "$email_regex" e

Here, e is a file containing an e-mail address, such as "[email protected]".

The egrep returns "[email protected]".

I tried the following regexes:

  • ([a-zA-Z0-9_.+-]+@([a-zA-Z0-9_-]\.)+[a-zA-Z0-9_-]+) - returned "[email protected]"
  • ([a-zA-Z0-9_.+-]+@[a-zA-Z0-9_-]+\.[a-zA-Z0-9._-]+) - returned "[email protected]", but also detects "[email protected]" as a valid address, and I don't want that.
  • A few other things that also didn't produce good results

Everywhere I looked, I only found questions about how to make grep match lazily, since the default is supposed to be greedy.


Solution

  • This regex should work for you:

    email_regex='[a-zA-Z0-9_.+-]+@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+'
    

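    In your regex, the last character class [a-zA-Z0-9_-] is missing the + quantifier, so each repetition of the group matches a dot followed by exactly one character. grep is still matching greedily; the pattern simply cannot consume more than one character after each dot.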
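To see the difference the missing quantifier makes, here is a minimal sketch that runs both patterns against one address. The address user@example.com is a made-up sample (the addresses in the question are not available), and grep -E is used as the equivalent of egrep.

```shell
# Hypothetical sample address, used only for illustration.
sample='user@example.com'

# Pattern from the question: the group matches a dot plus exactly ONE character.
broken_regex='[a-zA-Z0-9_.+-]+@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-])+'
# Corrected pattern: the + inside the group allows one or more characters per label.
fixed_regex='[a-zA-Z0-9_.+-]+@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+'

# -E enables extended regular expressions, -o prints only the matched text.
broken_match=$(printf '%s\n' "$sample" | grep -E -o "$broken_regex")
fixed_match=$(printf '%s\n' "$sample" | grep -E -o "$fixed_regex")

printf 'broken: %s\n' "$broken_match"   # prints "broken: user@example.c"
printf 'fixed:  %s\n' "$fixed_match"    # prints "fixed:  user@example.com"
```

The truncated result from the first pattern looks like lazy matching, but it is the longest match that pattern permits: after ".c" the next character is "o", which is not a dot, so the repeated group cannot continue.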