Search code examples
regexoracle-databaseoracle19c

SQL REGEX not working like its expected to


I'm having issues finding a solution for this

set serveroutput on;
declare
EmailRegexp CONSTANT VARCHAR2(1000) :='^[a-z0-9!#$%&''*+/=?^_`{|}~-]+(\.[a-z0-9!#$%&''*+/=?^_`{|}~-]+)*@([a-z0-9]([a-z0-9-]*[a-z0-9])?\.)+([A-Z]{2,})$';
p_vInEmailAddress VARCHAR2(30) := '[email protected]';

begin 
dbms_output.put_line('regex: '||LOWER(SUBSTR(REGEXP_SUBSTR(p_vInEmailAddress, '\.([^.\n\s]*)$'), 2))||''''); 
end;

--results:
--regex: '
--PL/SQL procedure successfully completed.

--now if i do this instead: 
p_vInEmailAddress VARCHAR2(30) := '[email protected]';
--I get results back

--PL/SQL procedure successfully completed.

--regex: com'

What am I doing wrong and why does it not like '.net' part?

How can I fix this?

Thanks


Solution

  • The problematic part of your query can be reduced to:

    begin 
      dbms_output.put_line(
        REGEXP_SUBSTR(
          '[email protected]',
          '\.([^.\n\s]*)$'
        )
      ); 
    end;
    /
    

    The regular expression \.([^.\n\s]*)$ is looking for:

    1. A dot character \.; then
    2. Zero-or-more characters that are not a dot . or a slash \ or an n or a slash \ or an s; then
    3. The end-of-the-string.

    The problem is that your string has an n character and the regular expression is excluding n as \n is interpreted as two characters and not the perl-like expression representing a single newline character. You want to replace \n with the CHR(10) character outside the string literal (or a newline inside it) and \s with the posix-expression [:space:].

    What you want is:

    begin 
      dbms_output.put_line(
        REGEXP_SUBSTR(
          '[email protected]',
          '\.([^.' || CHR(10) || '[:space:]]*)$'
        )
      ); 
    end;
    /
    

    or

    begin 
      dbms_output.put_line(
        REGEXP_SUBSTR(
          '[email protected]',
          '\.([^.
    [:space:]]*)$'
        )
      ); 
    end;
    /
    

    db<>fiddle here