The dataset is like this:
g1 = {d1,d3,d5}
g2 = {d1,d3,d2}
g3 = {d4,d3,d5}
I have the following table called director_genre:
director_id | genre
d1 g1
d1 g2
d2 g2
d3 g1
d3 g2
d3 g3
d4 g3
d5 g1
d5 g3
The output I want is
genre | director_id
g1 d1,d3,d5
g2 d1,d3
g3 d3,d5
The query I thought of is
SELECT genre
FROM director_genre
GROUP BY genre;
After this, I am not to put the logic as asked in the question. Please help me to get this done.
I would suggest the following query:
SELECT d.genre, GROUP_CONCAT(director_id)
FROM director_genre d
WHERE EXISTS (SELECT *
FROM director_genre d2
WHERE d.director_id = d2.director_id AND d.genre <> d2.genre
)
GROUP BY d.genre;
The nested subquery will select only records of directors who are associated with at least 2 genres (for each record in d
it looks whether there exists a record in d2
with the same director_id
but different genre
). Then they will be grouped by genre
.