I am validating e-mail IDs according to RFC 5322 and the rules described at
https://en.wikipedia.org/wiki/Email_address
Below is sample code using Java and a regular expression to validate e-mail IDs.
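// Requires java.util.ArrayList, java.util.List, java.util.regex.Matcher, java.util.regex.Pattern
public void checkValid() {
List<String> emails = new ArrayList<>();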
//Valid Email Ids
emails.add("[email protected]");
emails.add("[email protected]");
emails.add("[email protected]");
emails.add("[email protected]");
emails.add("[email protected]");
emails.add("[email protected]");
emails.add("[email protected]");
emails.add("[email protected]");
emails.add("carlosd'[email protected]");
emails.add("[email protected]");
emails.add("admin@mailserver1");
emails.add("[email protected]");
emails.add("\" \"@example.org");
emails.add("\"john..doe\"@example.org");
//Invalid Email Ids
emails.add("Abc.example.com");
emails.add("A@b@[email protected]");
emails.add("a\"b(c)d,e:f;g<h>i[j\\k][email protected]");
emails.add("just\"not\"[email protected]");
emails.add("this is\"not\\[email protected]");
emails.add("this\\ still\"not\\[email protected]");
emails.add("1234567890123456789012345678901234567890123456789012345678901234+x@example.com");
emails.add("[email protected]");
emails.add("[email protected]");
String regex = "^[a-zA-Z0-9_!#$%&'*+/=? \\\"`{|}~^.-]+@[a-zA-Z0-9.-]+$";
Pattern pattern = Pattern.compile(regex);
int i=0;
for(String email : emails){
Matcher matcher = pattern.matcher(email);
System.out.println(++i +"."+email +" : "+ matcher.matches());
}
}
Actual Output:
[email protected] : true
[email protected] : true
[email protected] : true
[email protected] : true
[email protected] : true
[email protected] : true
[email protected] : true
[email protected] : true
9.carlosd'[email protected] : true
[email protected] : true
11.admin@mailserver1 : true
[email protected] : true
13." "@example.org : true
14."john..doe"@example.org : true
15.Abc.example.com : false
16.A@b@[email protected] : false
17.a"b(c)d,e:f;g<h>i[j\k][email protected] : false
18.just"not"[email protected] : true
19.this is"not\[email protected] : false
20.this\ still"not\[email protected] : false
21.1234567890123456789012345678901234567890123456789012345678901234+x@example.com : true
[email protected] : true
[email protected] : true
Expected Output:
[email protected] : true
[email protected] : true
[email protected] : true
[email protected] : true
[email protected] : true
[email protected] : true
[email protected] : true
[email protected] : true
9.carlosd'[email protected] : true
[email protected] : true
11.admin@mailserver1 : true
[email protected] : true
13." "@example.org : true
14."john..doe"@example.org : true
15.Abc.example.com : false
16.A@b@[email protected] : false
17.a"b(c)d,e:f;g<h>i[j\k][email protected] : false
18.just"not"[email protected] : false
19.this is"not\[email protected] : false
20.this\ still"not\[email protected] : false
21.1234567890123456789012345678901234567890123456789012345678901234+x@example.com : false
[email protected] : false
[email protected] : false
How can I change my regular expression so that it rejects the following e-mail IDs?
1234567890123456789012345678901234567890123456789012345678901234+x@example.com
[email protected]
[email protected]
just"not"[email protected]
Below are the criteria for the regular expression:
Local-part
The local-part of the email address may use any of these ASCII characters:
- A to Z and a to z;
- 0 to 9;
- ., provided that it is not the first or last character unless quoted, and provided also that it does not appear consecutively unless quoted (e.g. [email protected] is not allowed but "John..Doe"@example.com is allowed);
- space and "(),:;<>@[\] characters are allowed with restrictions (they are only allowed inside a quoted string, as described in the paragraph below, and in addition, a backslash or double-quote must be preceded by a backslash);
- comments are allowed with parentheses at either end of the local-part; e.g. john.smith(comment)@example.com and (comment)[email protected] are both equivalent to [email protected].
Domain
- A to Z and a to z;
- 0 to 9, provided that top-level domain names are not all-numeric;
- -, provided that it is not the first or last character.
Comments are allowed in the domain as well as in the local-part; for example, john.smith@(comment)example.com and [email protected](comment) are equivalent to [email protected].

You could validate against RFC 5322 like this (reference regex, modified):
"(?im)^(?=.{1,64}@)(?:(\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"@)|((?:[0-9a-z](?:\\.(?!\\.)|[-!#\\$%&'\\*\\+/=\\?\\^`\\{\\}\\|~\\w])*)?[0-9a-z]@))(?=.{1,255}$)(?:(\\[(?:\\d{1,3}\\.){3}\\d{1,3}\\])|((?:(?=.{1,63}\\.)[0-9a-z][-\\w]*[0-9a-z]*\\.)+[a-z0-9][\\-a-z0-9]{0,22}[a-z0-9])|((?=.{1,63}$)[0-9a-z][-\\w]*))$"
https://regex101.com/r/ObS3QZ/1
# (?im)^(?=.{1,64}@)(?:("[^"\\]*(?:\\.[^"\\]*)*"@)|((?:[0-9a-z](?:\.(?!\.)|[-!#\$%&'\*\+/=\?\^`\{\}\|~\w])*)?[0-9a-z]@))(?=.{1,255}$)(?:(\[(?:\d{1,3}\.){3}\d{1,3}\])|((?:(?=.{1,63}\.)[0-9a-z][-\w]*[0-9a-z]*\.)+[a-z0-9][\-a-z0-9]{0,22}[a-z0-9])|((?=.{1,63}$)[0-9a-z][-\w]*))$
# Note - remove all comments '(comments)' before running this regex
# Find \([^)]*\) and replace with nothing
(?im) # Case insensitive, multiline
^ # BOS
# Local part
(?= .{1,64} @ ) # 64 max chars
(?:
( # (1 start), Quoted
" [^"\\]*
(?: \\ . [^"\\]* )*
"
@
) # (1 end)
| # or,
( # (2 start), Non-quoted
(?:
[0-9a-z]
(?:
\.
(?! \. )
| # or,
[-!#\$%&'\*\+/=\?\^`\{\}\|~\w]
)*
)?
[0-9a-z]
@
) # (2 end)
)
# Domain part
(?= .{1,255} $ ) # 255 max chars
(?:
( # (3 start), IP
\[
(?: \d{1,3} \. ){3}
\d{1,3} \]
) # (3 end)
| # or,
( # (4 start), Others
(?: # Labels (63 max chars each)
(?= .{1,63} \. )
[0-9a-z] [-\w]* [0-9a-z]*
\.
)+
[a-z0-9] [\-a-z0-9]{0,22} [a-z0-9]
) # (4 end)
| # or,
( # (5 start), Localdomain
(?= .{1,63} $ )
[0-9a-z] [-\w]*
) # (5 end)
)
$ # EOS
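To show how the comment-stripping note and the regex fit together, here is a minimal Java sketch. The class name Rfc5322Check and the helpers stripComments and isValid are made up for illustration, and the sample addresses in main are my own picks based on the criteria in the question, so treat it as a starting point rather than a finished validator.

```java
import java.util.regex.Pattern;

final class Rfc5322Check {
    // Same pattern as the Java string above, split into pieces for readability.
    private static final Pattern EMAIL = Pattern.compile(
        "(?im)^(?=.{1,64}@)"                                   // local part, 64 chars max
        + "(?:(\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"@)"           // group 1: quoted
        + "|((?:[0-9a-z](?:\\.(?!\\.)|[-!#\\$%&'\\*\\+/=\\?\\^`\\{\\}\\|~\\w])*)?[0-9a-z]@))" // group 2: non-quoted
        + "(?=.{1,255}$)"                                      // domain, 255 chars max
        + "(?:(\\[(?:\\d{1,3}\\.){3}\\d{1,3}\\])"              // group 3: IP literal
        + "|((?:(?=.{1,63}\\.)[0-9a-z][-\\w]*[0-9a-z]*\\.)+[a-z0-9][\\-a-z0-9]{0,22}[a-z0-9])" // group 4: dotted domain
        + "|((?=.{1,63}$)[0-9a-z][-\\w]*))$");                 // group 5: local domain

    // The "find \([^)]*\) replace with nothing" step from the note above.
    private static final Pattern COMMENT = Pattern.compile("\\([^)]*\\)");

    // Hypothetical helper: removes every "(...)" run, including inside quoted local parts.
    static String stripComments(String address) {
        return COMMENT.matcher(address).replaceAll("");
    }

    // Hypothetical helper: strips comments first, then requires a full match.
    static boolean isValid(String address) {
        return EMAIL.matcher(stripComments(address)).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "admin@mailserver1",
            "\"john..doe\"@example.org",
            "john.smith(comment)@example.com",
            "Abc.example.com",
            "john..doe@example.com"
        };
        for (String sample : samples) {
            System.out.println(sample + " : " + isValid(sample));
        }
    }
}
```

Note that the comment-stripping step is a blunt find-and-replace: it would also remove parentheses that appear inside a quoted local part, so you may want to apply it only to unquoted addresses.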
How do I make [email protected] a valid email ID? – Mihir Feb 7 at 9:34
I think the spec wants the local part either to be enclosed in quotes
or to begin and end with [0-9a-z].
But to get around the latter and make [email protected] valid, just
replace group 2 with this:
( # (2 start), Non-quoted
[0-9a-z]
(?:
\.
(?! \. )
| # or,
[-!#\$%&'\*\+/=\?\^`\{\}\|~\w]
)*
@
) # (2 end)
New regex
"(?im)^(?=.{1,64}@)(?:(\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"@)|([0-9a-z](?:\\.(?!\\.)|[-!#\\$%&'\\*\\+/=\\?\\^`\\{\\}\\|~\\w])*@))(?=.{1,255}$)(?:(\\[(?:\\d{1,3}\\.){3}\\d{1,3}\\])|((?:(?=.{1,63}\\.)[0-9a-z][-\\w]*[0-9a-z]*\\.)+[a-z0-9][\\-a-z0-9]{0,22}[a-z0-9])|((?=.{1,63}$)[0-9a-z][-\\w]*))$"
New demo
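If you want to see what the group 2 change does in practice, this small Java sketch compiles both versions side by side. The class name RelaxedLocalPartCheck and the address john_@example.com are assumptions of mine (the address from the comment is not reproduced here); I picked a local part that ends in a non-alphanumeric character, since that is the case the original group 2 rejects.

```java
import java.util.regex.Pattern;

final class RelaxedLocalPartCheck {
    // Shared pieces, copied from the regexes above.
    private static final String QUOTED =
        "(\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"@)";
    private static final String DOMAIN =
        "(?=.{1,255}$)"
        + "(?:(\\[(?:\\d{1,3}\\.){3}\\d{1,3}\\])"
        + "|((?:(?=.{1,63}\\.)[0-9a-z][-\\w]*[0-9a-z]*\\.)+[a-z0-9][\\-a-z0-9]{0,22}[a-z0-9])"
        + "|((?=.{1,63}$)[0-9a-z][-\\w]*))$";

    // Original group 2: local part must end with [0-9a-z].
    private static final String STRICT_GROUP2 =
        "((?:[0-9a-z](?:\\.(?!\\.)|[-!#\\$%&'\\*\\+/=\\?\\^`\\{\\}\\|~\\w])*)?[0-9a-z]@)";
    // Replacement group 2: only the first character must be [0-9a-z].
    private static final String RELAXED_GROUP2 =
        "([0-9a-z](?:\\.(?!\\.)|[-!#\\$%&'\\*\\+/=\\?\\^`\\{\\}\\|~\\w])*@)";

    static final Pattern STRICT = Pattern.compile(
        "(?im)^(?=.{1,64}@)(?:" + QUOTED + "|" + STRICT_GROUP2 + ")" + DOMAIN);
    static final Pattern RELAXED = Pattern.compile(
        "(?im)^(?=.{1,64}@)(?:" + QUOTED + "|" + RELAXED_GROUP2 + ")" + DOMAIN);

    public static void main(String[] args) {
        // Hypothetical sample addresses, not taken from the question.
        String[] samples = {"john_@example.com", "john.@example.com", "john..doe@example.com"};
        for (String sample : samples) {
            System.out.println(sample
                + " strict=" + STRICT.matcher(sample).matches()
                + " relaxed=" + RELAXED.matcher(sample).matches());
        }
    }
}
```

One side effect worth knowing: because the relaxed group no longer requires a closing [0-9a-z], it also accepts a local part that ends in a single dot (john.@example.com), which the criteria in the question disallow. Consecutive dots are still rejected.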