I need to GROUP BY two different columns and combine. Furthermore, IF a size with a different group_id matches another size, then I want to include ALL of those rows for that group_id.
For example, since 55 is in group_id 10 and group_id 20 I want to include all of the results in group_id 20 (IE 55,88,55,88).
p_id | group_id | size | the_date
------+----------+-----------+--------
1 | 10 | 55 | 2012-10-24
1 | 10 | 54 | 2014-08-09
1 | 20 | 55 | 2013-05-20
1 | 20 | 88 | 2014-05-20
1 | 20 | 55 | 2015-05-20
1 | 30 | 33 | 2014-05-20
1 | 30 | 55 | 2015-05-20
1 | 30 | 33 | 2015-05-20
1 | 40 | 99 | 2015-05-20
I've looked at some of the other questions on stack AND tried using row_number
and first_value
but nothing has worked. Below is my desired result.
p_id | group_id | size | group_id_arr | size_arr | date_arr
------+----------+------+----------------------------+-----------------------------+-----------
1 | 10 | 55 | {10,20,30,20,10,30,30,20} | {55,55,33,88,55,55,33,55} | {2012-10-24,2013-05-20,2014-05-20,2014-05-20,2014-08-09,2015-05-20,2015-05-20,2015-05-20}
1 | 40 | 99 | {40} | {99} | {2015-05-20}
Here is my DBFiddle where I am trying to achieve the following so I can aggregrate.
rn | p_id | group_id | size | the_date
------+----------+----------+------+-----------
1 | 1 | 10 | 55 | 2012-10-24
2 | 1 | 20 | 55 | 2013-05-20
3 | 1 | 30 | 33 | 2014-05-20
4 | 1 | 20 | 88 | 2014-05-20
5 | 1 | 10 | 54 | 2014-08-09
6 | 1 | 30 | 55 | 2015-05-20
7 | 1 | 30 | 33 | 2015-05-20
8 | 1 | 20 | 55 | 2015-05-20
1 | 1 | 40 | 99 | 2015-05-20
Please let me know if you have any questions. I'm open to all approaches, thanks.
I am not quiet sure about the ordering in the final arrays but this should be a detail you can fix:
WITH sizes as (
SELECT group_id, the_date, array_agg(size) sizes
FROM base_table
GROUP BY group_id, the_date
)
SELECT
a_group_id,
array_agg(b_group_id order by the_date),
array_agg(size order by the_date) as sizes,
array_agg(the_date order by the_date)
FROM (
SELECT
a.group_id as a_group_id,
b.group_id as b_group_id,
unnest(b.sizes) size, b.the_date
FROM sizes a
INNER JOIN sizes b
ON a.sizes && b.sizes
) s
GROUP BY a_group_id
group_id
group_id
.