php mysql database tagging

Find similar objects that share the most tags


I have two tables, objects and tags. Each object has an id, and each tag has an id, a name, and a parent (the id of the object it belongs to).

I want to pick an object and then find the other objects ordered by the number of tags they have in common with it, e.g. to return the 5 most similar objects.

EDIT:

SELECT parent,COUNT(*) as count
FROM `tag` 
WHERE tag="house" OR tag="dog" OR tag="cat" 
GROUP BY parent 
ORDER BY count DESC

This one does what I want, and I could fetch the object's tags ("house", "dog", "cat") with another query before this one. Any idea how I could combine these two queries?


Solution

  • Given one object, you can find its tags like this:

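     SELECT t1.name
     FROM tags t1
     WHERE t1.parent_id = ?
    

    Building on that, take that list of tags and find the other parent_ids that share them. Tags are matched by name here, on the assumption (from the question's description) that a tag's id is unique to its row, so comparing ids would only ever match the chosen object's own rows. Both ? placeholders take the id of the chosen object.

     SELECT t2.parent_id, COUNT(*) AS shared
     FROM tags t2
     WHERE EXISTS (
         SELECT 1
         FROM tags t1
         WHERE t1.parent_id = ?
         AND t1.name = t2.name      -- same tag name, not same row id
     )
     AND t2.parent_id <> ?          -- leave out the chosen object itself
     GROUP BY t2.parent_id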
    

    That gives you, for each other parent_id, the number of tags it shares with the chosen object.

    Add ORDER BY COUNT(*) DESC to get the "most similar" objects first, and LIMIT 5 to keep only the top five.

    Hope that helps.
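To try the combined query end to end, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite stands in for MySQL only so the example is self-contained. The table and column names (tags, name, parent_id) follow the answer above, while the sample rows and the most_similar helper are made up for illustration. It uses a self-join instead of EXISTS, which should be equivalent as long as an object never carries the same tag name twice.

```python
import sqlite3

# Assumed schema, mirroring the answer: tags(id, name, parent_id).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER)"
)

# Illustrative sample data only: object 1 is the object we start from.
conn.executemany(
    "INSERT INTO tags (name, parent_id) VALUES (?, ?)",
    [
        ("house", 1), ("dog", 1), ("cat", 1),
        ("house", 2), ("dog", 2), ("cat", 2),
        ("dog", 3), ("bird", 3),
        ("bird", 4),
    ],
)

# t1 holds the chosen object's tags; t2 holds rows of other objects
# that carry a tag with the same name.
SIMILAR_SQL = """
SELECT t2.parent_id, COUNT(*) AS shared
FROM tags t1
JOIN tags t2
  ON t2.name = t1.name
 AND t2.parent_id <> t1.parent_id
WHERE t1.parent_id = ?
GROUP BY t2.parent_id
ORDER BY shared DESC, t2.parent_id
LIMIT ?
"""


def most_similar(conn, object_id, limit=5):
    """Return (parent_id, shared_tag_count) rows, most similar first."""
    return conn.execute(SIMILAR_SQL, (object_id, limit)).fetchall()


print(most_similar(conn, 1))  # [(2, 3), (3, 1)]
```

Object 4 shares no tags with object 1, so it does not appear in the result at all. The same SQL text should run unchanged on MySQL through a prepared statement, with the object id and the limit bound as parameters.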