Search code examples
sqlcsvsplitsnowflake-cloud-data-platformdelimiter

Split by delimiter which is contained in a record


I have a column which I am splitting in Snowflake.

The format is as follows:

enter image description here

I have been using split_to_table(A, ',') inside of my query but as you can probably tell this uncorrectly also splits the Scooter > Sprinting, Jogging and Walking record.

Perhaps having the delimiter only work if there is no spaced on either side of it? As I cannot see a different condition that could work.

I have been researching online but haven't found a suitable work around yet, is there anyone that encountered a similar problem in the past?

Thanks


Solution

  • We can get a little clever with regexp_replace by replacing the actual delimiters with something else before the table split. I am using double pipes '||' but you can change that to something else. The '\|\|\\1' trick is called back-referencing that allows us to include the captured group (\\1) as part of replacement (\|\|)

    set str='car>bike,bike>car,truck, and jeep,horse>cat,truck>car,truck, and jeep';
    
    select $str, *
    from table(split_to_table(regexp_replace($str,',([^>,]+>)','\|\|\\1'),'||'))