Performance issue while evaluating email address with a regular expression


I am using the regular expression below to validate an email address.

/^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/

JavaScript code:

var email = 'myname@company.com';

var pattern = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;

if(pattern.test(email)){
    return true;
}

The regex evaluates quickly when I provide the invalid email below:

aseflj#$kajsdfklasjdfklasjdfklasdfjklasdjfaklsdfjaklsdjfaklsfaksdjfkasdasdklfjaskldfjjdkfaklsdfjlak@company.com

(I added #$ in the middle of the name)

However, when I try to evaluate this email, it takes too long and the browser hangs.

asefljkajsdfklasjdfklasjdfklasdfjklasdjfaklsdfjaklsdjfaklsfaksdjfkasdasdklfjaskldfjjdkfaklsdfjlak@company.com1

(I changed com to com1 at the end)

I'm sure that the regex is correct, but I'm not sure why it's taking so much time to evaluate the second example. If I provide a shorter email, it evaluates quickly. See the example below:

dfjjdkfaklsdfjlak@company.com1

Please help me fix the performance issue.


Solution

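  • Your regex runs into catastrophic backtracking. Because [\.-]? in ([\.-]?\w+)* is optional, the group can degenerate to (\w+)*, a classic case of catastrophic backtracking: a long run of word characters can be split between the inner \w+ and the outer * in exponentially many ways, and the engine tries them all when the overall match fails.

    Removing the ? resolves the issue.

    I also removed the redundant escape of . inside the character class and simplified the ending from (\.\w{2,3})+ to a single \.\w{2,3}.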

    ^\w+([.-]\w+)*@\w+([.-]\w+)*\.\w{2,3}$
    

    Do note that many new generic TLDs have more than 3 characters. Even some gTLDs from before the expansion have more than 3 characters, such as .info.

    As it stands, the regex also doesn't support internationalized domain names.
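
The corrected pattern from the answer can be exercised with a short sketch like the one below. The helper name isValidEmail and the variable names are assumptions made for illustration and do not appear in the original code. The inputs are the ones from the question, including the long address that made the original regex hang.

```javascript
// Corrected pattern from the answer: the separator [.-] is no longer optional,
// so a run of word characters can only be matched one way.
var fixedPattern = /^\w+([.-]\w+)*@\w+([.-]\w+)*\.\w{2,3}$/;

// Hypothetical helper name, used here only for illustration.
function isValidEmail(email) {
    return fixedPattern.test(email);
}

// Local part of the long address from the question.
var longLocalPart = 'asefljkajsdfklasjdfklasjdfklasdfjklasdjfaklsdfjaklsdjfaklsfaksdjfkasdasdklfjaskldfjjdkfaklsdfjlak';

var start = Date.now();
var results = [
    isValidEmail('myname@company.com'),            // true
    isValidEmail(longLocalPart + '@company.com'),  // true
    isValidEmail(longLocalPart + '@company.com1')  // false: "com1" is longer than 3 characters
];
var elapsedMs = Date.now() - start;

console.log(results, elapsedMs + ' ms');
```

With the optional separator removed, the failing com1 input should be rejected almost immediately instead of hanging the browser. The pattern still rejects TLDs longer than 3 characters, so the {2,3} bound would need loosening if addresses such as name@example.info must pass.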