Tags: sql, ruby-on-rails-3, tags, tagging

Rails 3 - which SQL query for related tags?


I've been planning this feature for a long time, but I can't really get started because I don't know how to express it in code. Whenever I think I've got it and know what I want, I get stuck again and everything stops making sense.

I have tags and taggings, so a has_many :through relation with articles. You can call article.tags and tag.articles.

Each tag has its own show page, much like on Stack Overflow. On this show page I want to list related tags, among other things. My idea is that the related tags should be the tags that most often appear alongside the show tag on the same articles. I hope this makes some sense.

Example: I'm on /tags/obama, so the related tags should be those used most often on articles that are tagged 'obama'. If I had 4 articles, 3 of them were tagged 'obama', and all 3 of those were also tagged 'united_states', then the tag most related to 'obama' would be 'united_states'. Sorry if I'm wordy.

I'm not even sure this is the best approach to finding related tags, but the idea works fine for me. However, I can't implement it.

First I would need to fetch all articles that include the show tag, so tag.articles. But what's the next step?

tag.articles.each do |article|
  article.tags

... I'm just getting confused at this point.


Solution

  • I think the best way to solve this is a many-to-many relation between tags, so that a tag can have many tags. In the record relating two tags, you store a count of how many articles they occur in together.

    You could also simply create a new tag-to-tag connection each time two tags occur in the same article. This will, however, create some redundancy in the database.

    If you do not want to introduce another table, you can get this to work the way you started, although it might be very slow even with a fairly small number of tags. Here is how I would do it if you cannot make a tag-to-tag connection:
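
    A rough sketch of the counter idea, in plain Ruby rather than as a real table. TagPairCounts and its method names are made up for illustration; in a Rails app the hash would presumably be a join table between tags with a count column, updated whenever an article is tagged.

    ```ruby
    # Hypothetical in-memory stand-in for a tag-to-tag table with a counter.
    # Keys are sorted pairs of tag names, so ["obama", "senate"] and
    # ["senate", "obama"] share one entry.
    class TagPairCounts
      def initialize
        @counts = Hash.new(0)
      end

      # Call once per article, passing all of its tag names.
      def record_article(tag_names)
        tag_names.uniq.sort.combination(2) { |pair| @counts[pair] += 1 }
      end

      # Names of tags that co-occur with tag_name, most frequent first
      # (ties are broken alphabetically).
      def related_to(tag_name)
        related = @counts.select { |pair, _count| pair.include?(tag_name) }
        related
          .map { |pair, count| [(pair - [tag_name]).first, count] }
          .sort_by { |name, count| [-count, name] }
          .map(&:first)
      end
    end

    counts = TagPairCounts.new
    counts.record_article(%w[obama united_states election])
    counts.record_article(%w[obama united_states])
    counts.record_article(%w[obama united_states senate])
    counts.record_article(%w[weather])
    p counts.related_to("obama") # => ["united_states", "election", "senate"]
    ```

    The trade-off is that every tagging change has to update the counts, but the show page then only needs a lookup instead of walking all articles.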

    hash_storage = Hash.new(0) # 0 is the default value, so += 1 works for unseen tags
    tag.articles.each do |article|
      article.tags.each do |t|
        # we now know that tag "t" is on the same article as our original tag;
        # we don't care if t is actually the same as our original tag
        hash_storage[t] += 1 unless t == tag
      end
    end
    # Sort the tags by how often they co-occurred, most frequent first.
    ordered_tags = hash_storage.sort_by { |_t, count| -count }.map(&:first)

    ordered_tags.each do |t|
      # do whatever. The tags are now ordered by how often they occur together with the initial tag.
    end
    

    Hope this helps :)
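
To try the counting approach outside Rails, here is a minimal sketch on the example from the question. It assumes each article is just an array of tag names; ARTICLES and related_tags are hypothetical names, and the extra tags (election, senate, weather) are invented sample data.

```ruby
# Hypothetical sample data: 4 articles, 3 of them tagged "obama",
# and all 3 of those also tagged "united_states".
ARTICLES = [
  %w[obama united_states election],
  %w[obama united_states],
  %w[obama united_states senate],
  %w[weather]
]

# Returns [tag_name, count] pairs for every tag that shares an article
# with `tag`, most frequent first; ties are broken alphabetically.
def related_tags(articles, tag)
  counts = Hash.new(0)
  articles.each do |tags|
    next unless tags.include?(tag)
    tags.uniq.each { |t| counts[t] += 1 unless t == tag }
  end
  counts.sort_by { |name, count| [-count, name] }
end

related_tags(ARTICLES, "obama").each do |name, count|
  puts "#{name}: #{count}"
end
# prints:
# united_states: 3
# election: 1
# senate: 1
```

Sorting on [-count, name] keeps the order stable when two tags have the same count, which a plain sort on the count alone does not guarantee.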