Search code examples
sqlarraysaggregate-functionspresto

Arraying two array rows and display only distinct values


I want to aggregate two array rows. For example, in the table below, colour column already shows the values in array format.

id colour
1 [[Red,Blue,Yellow]]
2 [[Green]]
1 [[White,Blue]]
2 [[Green,Black]]

By aggregating on id and distinct-selecting the colour column, I want to create a result like below.

id colour
1 [[Red,Blue,Yellow,White]]
2 [[Green,Black]]

How do I achieve this?


Solution

  • Presto has lots of array functions. We can try:

    select id, array_distinct(flatten(array_agg(colour))) as colours
    from mytable
    group by id
    

    This assumes that each array contains a list of scalars, so something like: [Red,Blue,Yellow] - not a one element sub-array (as shown in your sample data like [[Red,Blue,Yellow]]).