Search code examples
snowflake-cloud-data-platformregexp-like

alternative to below regular expression in snowflake


I am trying to use the oracle regular expression code in snowflake, but getting null value as result where as it is working as expected in snowflake.

Requirement: Compare post code and return result if it matches the format.

SELECT  
   CASE WHEN REGEXP_LIKE('AB101TZ','^[A-Z]{2}[0-9]+') THEN 'AB101TZ'
         WHEN REGEXP_LIKE('AB101TZ','^[A-Z][0-9]+') THEN 'AB101TZ'
         ELSE '-'
     END postcode_part_a

What modifications need to do this expression


Solution

  • So as Greg noted that REGEX functions automatically anchor, thus the ^ & $ tokens are not "needed" but if you want an open tail then you need to add a .*.

    SELECT column1
        ,REGEXP_LIKE(column1,'[A-Z]{2}[0-9]+') as A
        ,REGEXP_LIKE(column1,'[A-Z][0-9]+') as B
        ,REGEXP_LIKE(column1,'[A-Z]{2}[0-9]{1,3}[A-Z]{2}') as C
        ,REGEXP_LIKE(column1,'[A-Z]{1,2}[0-9]+.*') as D
    FROM VALUES ('AB101TZ'), ('_AB101TZ'), ('AA0000'), ('A00000');
    

    thus this gives:

    COLUMN1   A       B       C       D
    AB101TZ   FALSE   FALSE   TRUE    TRUE
    _AB101TZ  FALSE   FALSE   FALSE   FALSE
    AA0000    TRUE    FALSE   FALSE   TRUE
    A00000    FALSE   TRUE    FALSE   TRUE
    

    so A & B are your to matches, but with the '$' removed to show what they do match as is. And thus why your input in not matching because TZ is not a number

    C is Greg's solution. Which will not what your second filter ^[A-Z][0-9]+' would match, thus I made D which allows 1 or 2 characters, then some numbers, then anything. Anyways it should be possible to see how to mix and match those matching parts to match the data you have, in the format that is correct for you.