I have a database which looks like below. It lists different people's favourite colours when they were children or after they became adults. It groups people when they have common favourite colour. For example, ID number 1 and 2 are grouped together as they have red in common. In such case, the second row (ID number 2, group b, favourite colour blue -)also is grouped as b since it is the same ID/person as row 3.
id | group | favourite_colour | from_when | points |
---|---|---|---|---|
1 | b | red | child | 30 |
2 | b | blue | child | 50 |
2 | b | red | adult | 50 |
3 | c | white | adult | 22 |
4 | c | white | adult | 24 |
Based on this, I want to create such a query as follwing:
select
group,
from_when,
array_agg(distinct id),
array_agg(distinct favourite_colour)
from database_example
group by
group,
from_when
having sum(points) > 70
The objectives is to group the people who belong to the same group by 'from_when' column and filter by summed points more than 70.
With the given example, this should return:
group | from_when | id | favourite_colour |
---|---|---|---|
b | child | 1,2 | red,blue |
I want to add one more column in the result which can show how ID 1 and 2 are grouped together because, if we only look at the current query result, it looks like they don't belong to the same group with their different favourite colours. However, in reality, ID 1 and 2 are grouped together as ID 1 (child) and ID 2 (adult) which currently is not shown in the result share the same favourite colour.
As an example, an additional column (connection) like below is the type of results I'd like to include.
group | from_when | id | favourite_colour | connection |
---|---|---|---|---|
b | child | 1,2 | red,blue | 1-red-child,2-red-adult |
How could this be achieved?
You can do it using CTEs.
First CTE is the query where we get data grouped by group and from_when, while second CTE is used to get data grouped by group and favourite_colour and respected same condition, then Using some joins we get the Expected output :
with cte_from_when as (
select group_, from_when,
array_agg(distinct id) as ids,
array_agg(distinct favourite_colour) as colors
from mytable
group by group_, from_when
having sum(points) > 70
),
cte_favourite_colour as (
select group_, favourite_colour
from mytable
group by group_, favourite_colour
having sum(points) > 70
)
select cw.group_, cw.from_when, min(ids) as ids, min(cw.colors) as favourite_colour, array_agg(t.id || '-' || cc.favourite_colour || '-' || t.from_when) as connection
from cte_from_when cw
inner join mytable t on t.group_ = cw.group_ and t.favourite_colour = ANY(colors)
inner join cte_favourite_colour cc on cc.group_ = t.group_ and cc.favourite_colour = t.favourite_colour
group by cw.group_, cw.from_when
Tested on Postgresql, check demo here