Search code examples
sqlregexsubstringsnowflake-cloud-data-platform

Snowflake Regular Expression


I have this string in Snowflake column:

\[
{
"entryListId": 3279,
"id": 4617,
"name": "SpecTra",
"type": 0
},
{
"entryListId": 3279,
"id": 7455,
"name": "Signal Capital Partners",
"type": 0
}
\]

I need to get names in this format regardless of the number of company names: "SpecTra, Signal Capital Partners". In other words, I need to extract company names and concatenate them.

I have tried this :

regexp_replace(col, '"(\[^"\]+)"|.', '\\1|') 

and regexp_substr() function, but did not get the desired output

Can you please help me with this? Thanks


Solution

  • You can use

    trim(regexp_replace(regexp_replace(col, '"name":\\s*"([^"]+)"|.', '\\1,'), ',+', ','), ',')
    

    Details:

    • "name":\s*"([^"]+)"|. regex matches "name":, then zero or more whitespaces, and a ", and then captures into Group 1 any one or more chars other than " and then matches a " char, and replaces with Group 1 and a comma
    • The second regexp_replace shrinks all commas into a single occurrence of a comma, ,+ matches one or more commas (you may also use a more specific ,{2,} pattern here instead)
    • trim removes commas from start and end.