I need to group row's in one row if they have same id or same attribute. So i suppose i need to use INNER JOIN
and GROUP_CONCAT
, but i don't know how. Problem is that if two users do not have a common attribute, but fall into the same group with the same third user, all three must be combined into one group. Also i don't have group_id
column in table.
group_id, user_id, group_attributes
1, 1, "red, green, yellow, grey, purple, coffeemaker"
1, 2, "red, green, yellow, grey, purple, coffeemaker"
1, 3, "red, green, yellow, grey, purple, coffeemaker"
1, 4, "red, green, yellow, grey, purple, coffeemaker"
1, 5, "red, green, yellow, grey, purple, coffeemaker"
2, 6, "coffee, milk, croissant"
2, 7, "coffee, milk, croissant"
2, 8, "coffee, milk, croissant"
Raw data to reduce your answer time.
CREATE TABLE task (
user_id INT(10) NOT NULL,
attribute VARCHAR(50) NULL DEFAULT NULL);
INSERT INTO task (user_id, attribute)
VALUES
(1, 'red'),
(1, 'green'),
(2, 'green'),
(2, 'yellow'),
(3, 'grey'),
(3, 'coffeemaker'),
(4, 'grey'),
(4, 'purple'),
(5, 'purple'),
(5, 'red'),
(6, 'black'),
(7, 'black'),
(7, 'milk'),
(8, 'milk'),
(8, 'croissant');
This is a graph-walking problem, so a simple JOIN
is not sufficient. One approach is to get all attributes associates with a given attribute. The following recursive CTE does this:
with recursive aa as (
select distinct t1.attribute as at1, t2.attribute as at2
from task t1 join
task t2
on t1.user_id = t2.user_id
),
cte as (
select at1, at2, at1 as found, 1 as lev
from aa
union all
select cte.at1, aa.at2, concat_ws(',', found, aa.at2), lev + 1
from cte join
aa
on cte.at2 = aa.at1
where find_in_set(aa.at2, found) = 0
)
select distinct at1, at2
from cte;
You can then use the same recursive CTE to combine the values into a string:
with recursive aa as (
select distinct t1.attribute as at1, t2.attribute as at2
from task t1 join
task t2
on t1.user_id = t2.user_id
),
cte as (
select at1, at2, at1 as found, 1 as lev
from aa
union all
select cte.at1, aa.at2, concat_ws(',', found, aa.at2), lev + 1
from cte join
aa
on cte.at2 = aa.at1
where find_in_set(aa.at2, found) = 0
)
select dense_rank() over (order by pairs.all_attributes) as group_id, t.user_id, pairs.all_attributes
from (select at1, group_concat(at2) as all_attributes
from cte
group by at1
) pairs join
(select user_id, min(attribute) as min_attribute
from task
group by user_id
) t
on t.min_attribute = pairs.at1;
I see nothing wrong with this code. But db<>fiddle is insisting on creating a hex string for pairs
. However, I think this will work on your database. Here is the fiddle.