While searching for regular expressions used for email address validation, I came across this page: http://www.regular-expressions.info/email.html. I couldn't understand it.
It says: \b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4}\b will match [email protected] but not [email protected].
Can you explain in detail how (?:[A-Z0-9-]+\.) works, and why it doesn't match [email protected] but does match the other one?
That's because the \. inside the group matches exactly one dot per repetition, so several dots in a row will not be matched. For .. or ... etc. to be matched, it would have to be \.+ (the + means one or more, and is the same as {1,}).
The regex says (?:[A-Z0-9-]+\.)+ so it is one or more letters, digits, or hyphens (the class does not include underscore), followed by a single dot, and this whole thing can repeat one or more times, so c.c.c. will match, but c..c.c. will not.
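Here is a small sketch in Python to try this out. The sample addresses (user@mail.example.com and user@example...com) and the helper name matches_whole are made up for illustration, and re.IGNORECASE is an assumption on my part, since the character classes only list A-Z.

```python
import re

# Just the repeated group from the pattern: label, dot, label, dot, ...
DOMAIN_PART = re.compile(r"(?:[A-Z0-9-]+\.)+", re.IGNORECASE)

# The full pattern from the question, with the dot escaped.
EMAIL = re.compile(
    r"\b[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,4}\b",
    re.IGNORECASE,
)


def matches_whole(pattern, text):
    """Return True if the pattern matches the entire string (hypothetical helper)."""
    return pattern.fullmatch(text) is not None


# Each repetition needs at least one character before its single dot,
# so two dots in a row break the match.
print(matches_whole(DOMAIN_PART, "c.c.c."))    # True
print(matches_whole(DOMAIN_PART, "c..c.c."))   # False

# Hypothetical sample addresses.
print(matches_whole(EMAIL, "user@mail.example.com"))  # True
print(matches_whole(EMAIL, "user@example...com"))     # False
```

In the last case the group consumes "example." and then the next character is another dot, which fits neither [A-Z0-9-]+ nor [A-Z]{2,4}, so the match fails.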
The (?: ) is a non-capturing group, and is usually a little faster. You can use ( ) and it works as well, but it may be slightly slower and the matched text will go into a capturing group.
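To see the difference between the two kinds of group, here is a rough Python sketch. The input string "a.b.c." is just an example I picked; it only shows what ends up in the groups, not any speed difference.

```python
import re

capturing = re.compile(r"([a-z0-9-]+\.)+")
non_capturing = re.compile(r"(?:[a-z0-9-]+\.)+")

m_cap = capturing.fullmatch("a.b.c.")
m_non = non_capturing.fullmatch("a.b.c.")

# Both patterns match the same overall text.
print(m_cap.group(0))   # a.b.c.
print(m_non.group(0))   # a.b.c.

# A repeated capturing group keeps only its last repetition in Python.
print(m_cap.groups())   # ('c.',)

# The non-capturing version stores no groups at all.
print(m_non.groups())   # ()
```

So if you don't need the text matched by the group, (?: ) saves the engine from storing it.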