Let's say I have dogs(dogid)
table and fleas(fleaid)
table and another one dogs_favorite_fleas(dogid,fleaid)
a dog can have his favorite fleas listed in the dog_favorite_fleas
. How can I create a query/temp table lists all the unique sets of fleas and counts them? For example let's say Rex
likes flea a
and flea b
, and Spot
likes 'flea aand
flea c, and
Lassielikes
flea band
flea a`. The result of the query should be:
fleas |count
-------------------------
flea a, flea b |2
flea a, flea c |1
Note: There are ~5000 different fleas and ~4.5 million dogs.
What's the best way of making this query?
Use GROUP_CONCAT
with a GROUP BY
:
SELECT GROUP_CONCAT(f.flea SEPARATOR ', ') AS 'fleas', COUNT(df.dogid) 'Count'
FROM dogs_favorite_fleas df
INNER JOIN fleas f ON df.fleaid = f.id
INNER JOIN dogs d ON df.dogid = d.id
GROUP BY df.dogid
Update: If you want to count how many grouped fleas lists there are, enclose the previous query in a subquery and use GROUP BY
with COUNT
like so:
SELECT fleas, COUNT(fleas) 'COUNT'
FROM
(
SELECT GROUP_CONCAT(f.flea SEPARATOR ', ') AS 'fleas'
FROM dogs_favorite_fleas df
INNER JOIN fleas f ON df.fleaid = f.id
INNER JOIN dogs d ON df.dogid = d.id
GROUP BY df.dogid
ORDER BY f.flea
) t
GROUP BY fleas