Tags: database, ruby-on-rails-3, database-performance

About YouTube's view count


I'm implementing an app that keeps track of how many times a post is viewed, but I'd like to track views in a 'smart' way. That is, I don't want to increase the view counter just because a user refreshes their browser.

So I decided to increase the view counter only if the combination of IP address and user agent (browser) is unique, which is working so far.

But then I thought: if YouTube does it this way, and they have many videos with thousands or even millions of views, their views table in the database would be flooded with IP addresses and user agents.

This brings me to the assumption that their video table has a counter cache for views (i.e. views_count). That is, when a user clicks on a video, the IP and user agent are stored, and the counter cache column in the video table is incremented.

Every time a video is clicked, YouTube would need to query the views table and count the number of entries. Won't this affect performance drastically?

Is this how they do it? Or is there a better way?


Solution

  • First of all, as far as I know, YouTube uses BigTable, so don't worry about how they query the count; we don't know the exact structure of their database anyway.

    Assuming that you are on a relational model, create a view_count column, but do not update it on every refresh. Record the visits and periodically update the cache.

    Also, you can generate a hash from the IP, browser, date and any other information you use to detect whether a view is unique, and store that hash instead of the full data.

    Also, you can use the session or a cookie to record which posts have been viewed. Since it will expire, it won't be much of a memory problem - I don't believe anyone views thousands of videos in one session.
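
As a rough illustration of the hashing and periodic-update suggestions above, here is a minimal sketch in plain Ruby. The ViewTracker class, its method names, and the in-memory hashes are all made up for this example; in a real Rails app the seen fingerprints and the view_count column would live in the database or a cache, and flush would run from a scheduled job. The choice of SHA-256 and the '|' separator are assumptions too.

```ruby
require 'digest'
require 'set'
require 'date'

# Hypothetical in-memory stand-in for a views table plus a cached
# view_count column; not a Rails model.
class ViewTracker
  attr_reader :view_counts

  def initialize
    @seen = Set.new             # fingerprints that were already counted
    @pending = Hash.new(0)      # post_id => views recorded but not yet flushed
    @view_counts = Hash.new(0)  # post_id => cached view_count
  end

  # Hash the identifying fields so the raw IP and user agent are not stored.
  def fingerprint(post_id, ip, user_agent, date)
    Digest::SHA256.hexdigest([post_id, ip, user_agent, date.to_s].join('|'))
  end

  # Record a visit. Returns true if it counted as a new view,
  # false if the same fingerprint was already seen (e.g. a refresh).
  def record(post_id, ip, user_agent, date = Date.today)
    key = fingerprint(post_id, ip, user_agent, date)
    return false unless @seen.add?(key) # Set#add? returns nil for duplicates
    @pending[post_id] += 1
    true
  end

  # Meant to be called periodically: move pending views into the cache.
  def flush
    @pending.each { |post_id, count| @view_counts[post_id] += count }
    @pending.clear
  end
end

tracker = ViewTracker.new
day = Date.new(2012, 1, 1)
tracker.record(1, '203.0.113.7', 'Firefox', day) # new view
tracker.record(1, '203.0.113.7', 'Firefox', day) # refresh, ignored
tracker.record(1, '203.0.113.8', 'Firefox', day) # different IP, new view
tracker.flush
puts tracker.view_counts[1] # prints 2
```

Because the date is part of the fingerprint, the same visitor counts again on a later day; leave it out if a visitor should only ever count once per post.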