Search code examples
regexoracle-databaseplsqlemail-validation

Regexp_like vs regex validators online - diferent results


I have a regex expression for email validation using plsql that is giving me some headaches... :) This is the condition I'm using for an email ([email protected]) validation:

IF NOT REGEXP_LIKE (user_email, '^([\w\-\.]+)@((\[([0-9]{1,3}\.){3}[0-9]{1,3}\])|(([\w\-]+\.)+)([a-zA-Z]{2,4}))$') THEN
            control := FALSE;
            dbms_output.put_line('EMAIL '||C.user_email||' not according to regex');
END IF;

If I make a select based on the expression I don't get any values either:

Select * from TABLE_X where REGEXP_LIKE (user_email, '^([\w\-\.]+)@((\[([0-9]{1,3}\.){3}[0-9]{1,3}\])|(([\w\-]+\.)+)([a-zA-Z]{2,4}))$');

Using regex101.com I get full match with this email: [email protected]

Any idea?


Solution

  • The regular expression syntax that Oracle supports is in the documentation.

    It seems Oracle doesn't understand the \w inside the []. You can expand that to:

    with table_x (user_email) as (
      select '[email protected]' from dual
      union all
      select 'bad [email protected]' from dual
    )
    Select * from TABLE_X
    where REGEXP_LIKE (user_email, '^[a-zA-Z_0-9.-]+@((\[([0-9]{1,3}\.){3}[0-9]{1,3}\])|([a-zA-Z_0-9-]+.)+[a-zA-Z]{2,4})$');
    
    USER_EMAIL            
    ----------------------
    [email protected]
    

    You don't need to escape the . or - inside the square brackets, by doing that you would allow literal backslashes to be matched.

    This sort of requirement has come up before - e.g. here - but you seem be allowing IP address octets instead of FQDNs, enclosed in literal square brackets, which is unusual.

    As @BobJarvis said you could also use the [:alnum:] but would still need to include underscore. That could allow non-ASCII 'letter' characters you aren't expecting; though they may be valid, as are other symbols you exclude; you seem to be following the 'common advice' mentioned in that article though.