Search code examples
sqlpostgresqlcsvunnestlateral-join

SQL filtering a text string by unique values


My PostgreSQL database includes a column which is a text string that takes unique but unpredictable values, like so:

id    var
1     "A", "B"
2     "B", "C"
3     "C", "A"
4     "eggs", "toast"
5     "eggs", "bacon"

I need a SQL query which pulls the unique values out of var, like so...

1 A
2 B
3 C
4 bacon
5 eggs
6 toast

...but I'm not too experienced with text stuff in SQL. Help?


Solution

  • You could use regexp_split_to_table() to split these CSV strings, and then retain the distinct values only

    select distinct x.str
    from mytable t
    cross join lateral regexp_split_to_table(t.var, ', ') x(str)
    

    If you really have double quotes around the CSV elements, and you want to remove them from the results, then:

    select distinct replace(x.str, '"', '') str
    from mytable t
    cross join lateral regexp_split_to_table(t.var, ', ') x(str)
    

    For the fun of it, we could also use json: surrounding the strings with square brackets make them valid json array strings, which we can then unnest with jsonb_array_elements_text() (an upside is that it handles the unquoting for us under the hood):

    select distinct x.str
    from mytable t
    cross join lateral jsonb_array_elements_text( ('[' || t.var || ']')::jsonb) x(str)
    

    Demo on DB Fiddle