
SQL query where I get a single column as an array


Here is my data

customer_id, user_id, value, year
c1, u1, 100, 2022
c1, u2, 200, 2022
c1, u3, 300, 2022

I want to get the values for all users of customer c1 as an array from a SQL query. How do I achieve that?

SELECT value FROM table WHERE customer_id="c1" AND year="2022" gives me something like

+-------+
| value |
+-------+
|  100  |
|  200  |
|  300  |
+-------+

I want a query that returns this instead

values
[100, 200, 300]

I asked ChatGPT, and it simply told me to run SELECT ARRAY(SELECT value FROM table WHERE customer_id="c1" AND year="2022") as values, but I don't think that is correct. I do think ChatGPT understood what I was trying to do, so hopefully my question is clear from this made-up SQL query as well.


Solution

  • LISTAGG is what you need in csvq.

    I don't have csvq but try this:

     -- LISTAGG works like concat() or Concatenate(): it joins the values that
     -- would otherwise be individual row results into a ','-delimited string.
     -- WITHIN GROUP (ORDER BY value) sets the order of the values in the list.
     -- Everything is aggregated into one group in this case, but you could
     -- group on customer_id and year instead and remove those qualifiers
     -- from the WHERE clause below.
     SELECT LISTAGG(value, ',') WITHIN GROUP (ORDER BY value) as "values"
     from data.csv
     WHERE customer_id="c1" AND year="2022"
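
The grouped variant mentioned in the comments (group on customer_id and year instead of filtering in WHERE) can be tried without csvq. The sketch below is only a stand-in: it uses Python's built-in sqlite3 module and SQLite's GROUP_CONCAT in place of csvq's LISTAGG, so it is not csvq syntax. The table name `data`, the extra `c2` row, and the helper `values_by_customer_year` are made up for illustration. Since the aggregate returns a delimited string rather than a real array, the sketch splits and sorts it in Python to get the list the question asks for.

```python
import sqlite3

# Sample rows from the question, plus one hypothetical extra customer (c2)
# added only to show that the grouping keeps customers apart.
ROWS = [
    ("c1", "u1", 100, 2022),
    ("c1", "u2", 200, 2022),
    ("c1", "u3", 300, 2022),
    ("c2", "u4", 400, 2022),
]


def values_by_customer_year(rows):
    """Return {(customer_id, year): sorted list of values} for the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data "
        "(customer_id TEXT, user_id TEXT, value INTEGER, year INTEGER)"
    )
    conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", rows)

    # GROUP_CONCAT is SQLite's counterpart to LISTAGG: one delimited string
    # per (customer_id, year) group instead of one row per value.
    cursor = conn.execute(
        "SELECT customer_id, year, GROUP_CONCAT(value, ',') "
        "FROM data GROUP BY customer_id, year"
    )

    result = {}
    for customer_id, year, joined in cursor:
        # The order inside the concatenated string is not guaranteed,
        # so split it and sort explicitly to get a stable list.
        result[(customer_id, year)] = sorted(int(v) for v in joined.split(","))

    conn.close()
    return result


grouped = values_by_customer_year(ROWS)
print(grouped[("c1", 2022)])
```

If a single customer is all that is needed, the WHERE filter from the query above does the same job as looking up one key in the returned dictionary.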