Search code examples
replacesubstringsnowflake-cloud-data-platform

Regular expression to get specific pattern in snowflake


I have a table with column data like below.

Column5 :
1) ["[\"( "ABC12345678", "ABC00123451","ABC00543211")\"]"]
2) ["[\"( ABC87654321\"]"]

I just need to clean this column and fetch it like below.

1) ABC12345678,ABC00123451,ABC00543211
2) ABC87654321

Currently I am using replace function repeatedly to clean the data

replace( replace (replace (replace(replace(replace(replace (Column5,'[',''),']',''),'',''),'"',''),'\'',''),')',''),'(','') as column5list

is there any regular expression which can I use for the purpose to clean the data.

Pattern remains same ABC followed by 8 digits


Solution

  • Nested REPLACE could be simplfiied with TRANSLATE:

    SELECT Column5, TRANSLATE(Column5, $$[]"'()$$, '') AS result
    FROM tab;
    

    Sample data:

    CREATE OR REPLACE TABLE tab AS
    SELECT '["[\"( "ABC12345678", "ABC00123451","ABC00543211")\"]"]' AS column5;
    

    Output:

    enter image description here