Search code examples
sqlgroup-bygoogle-bigqueryconcatenationstring-concatenation

Group Concatenate Strings (Rows) in BigQuery


I am working with Google BigQuery & I have a query that looks like the following:

 SELECT
        prod.abc
        uniqueid,
        variable2,
        cust.variable1,
        purch.variable2,
        from mydata.order
        left join
        UNNEST(purchases) as purch,
        UNNEST(codes_abs) as cod, UNNEST(cod.try_products) as prod

When I do this, this results in a table that looks like this:

    |prod.abc| uniqueid | variable2 | ...|
    |APP123  | customer1| value     | ...|
    |BLU155  | customer1| value     | ...|
    |TRI134  | customer1| value     | ...|
    |LO123   | customer2| value     | ...|
    |ZU9274  | customer2| value     | ...|
    |TO134   | customer3| value     | ...|

What I would like to do is to concatenate values in column "prod.abc", group them by "uniqueid" and separate them by ",". I found numerous solutions online, however, since I have unnested other variables in my query, none of the solutions I found seem to work in my case. The values do not need to be ordered in any way. Basically, what I would like to end up with is:

    |prod.abc                  | uniqueid | variable2 | ...|
    |APP123, BLU155, TRI134    | customer1| value     | ...|
    |LO123, ZU9274             | customer2| value     | ...|
    |TO134                     | customer3| value     | ...|

It would also be okay to get a table like this where duplicates are kept, as I could remove them later on:

|prod.abc                  | uniqueid | variable2 | ...|
|APP123, BLU155, TRI134    | customer1| value     | ...|
|APP123, BLU155, TRI134    | customer1| value     | ...|
|APP123, BLU155, TRI134    | customer1| value     | ...|
|LO123, ZU9274             | customer2| value     | ...|
|LO123, ZU9274             | customer2| value     | ...|
|TO134                     | customer3| value     | ...|

Any help is much appreciated. Thank you!


Solution

  • Do each unnest separately: Does aggregation work?

    SELECT STRING_AGG(item.abc, ',')
           uniqueid, variable2, cust.variable1, purch.variable2
    FROM mydata.order LEFT JOIN
         UNNEST(purchases) as purch
         ON true LEFT JOIN
         UNNEST(codes_abs) as cod
         ON true LEFT JOIN
         UNNEST(cod.try_items) as item
         ON true
    GROUP BY uniqueid, variable2, cust.variable1, purch.variable2;