Search code examples
stringsubstringsnowflake-cloud-data-platform

Snowflake String extract


I have a field with multiple values in one columns separated by pipe (|) delimiter like below:

"PR|20078|ABBV|Ven|1|PS|1|DFT|efficacy|1|0|W1|Key22 CLL EASTERN|Litmus ID: 865avvwr|2022-04-28"

I want to extract "Litmus ID: 865avvwr" from the above string.

This value can show up between any of these 2 pipe delimiter in the data. Can you please help?

Thanks, Anu

I tried using REGEXP_SUBSTR but I'm not getting the syntax right.


Solution

  • Option 1 - use regex substr() to extract the word but if you have multiple words after Litmus use option 2.

    select regexp_substr('W1|Key22 CLL EASTERN|Litmus ID: 865avvwr|2022-04-28','Litmus ID\\W+(\\w+)', 1, 1, 'e', 1) 
    

    Option 2 -
    You can use regex instr, substring and then split part like below.

    split_part(
      substr(mycol,
        regexp_instr(mycol,'Litmus ID')
      )
    ,'|',1    
    ) output
    

    regexp_instr() - is used to deletermine location of the word Litmus ID.
    substr() - is used to cut string from that point forward - like Litmus ID: abcxyz|othrs split_part() - is used to split data based on | and show first part - like Litmus ID: abcxyz.

    sample SQL -

    select split_part(
    substr('W1|Key22 CLL EASTERN|Litmus ID: 865avvwr|2022-04-28',
    regexp_instr('W1|Key22 CLL EASTERN|Litmus ID: 865avvwr|2022-04-28','Litmus ID'))
    ,'|',1    
    ) c