ruby-on-rails activerecord eager-loading active-relation

How do I get Rails to eager load counts?


This is related to a question asked a little over a year ago.

I put up a demo app that reproduces the problem and should work out of the box, provided you have sqlite3 available: https://github.com/cairo140/rails-eager-loading-counts-demo

Installation instructions (for the main branch)

git clone git://github.com/cairo140/rails-eager-loading-counts-demo.git
cd rails-eager-loading-counts-demo
rails s

I have a fuller write-up in the repository, but my general question is this.

How can I make Rails eager load counts in a way that minimizes db queries across the board?

The n+1 problem emerges whenever you call #count on an association, even when that association has been included via #includes(:associated) in the ActiveRelation. A workaround is to use #length, but that works well only when the association has already been loaded, and I suspect it duplicates something the Rails internals have already done. #length also has the drawback of loading every associated record when the association was not loaded to begin with and the count is all you need.

From the readme:

We can dodge this issue by running #length on the posts array (see appendix), which is already loaded, but it would be nice to have count readily available as well. Not only is it more consistent; it provides a path of access that doesn't necessarily require posts to be loaded. For instance, if you have a partial that displays the count no matter what, but half the time, the partial is called with posts loaded and half the time without, you are faced with the following scenario:

  • Using #count
    • n COUNT style queries when posts are already loaded
    • n COUNT style queries when posts are not already loaded
  • Using #length
    • Zero additional queries when posts are already loaded
    • n SELECT * style queries when posts are not already loaded

Between these two choices, there is no dominant option. But it would be nice to revise #count so that it defers to #length, or reads a length stored some other way behind the scenes, giving the following scenario:

  • Using revised #count
    • Zero additional queries when posts are already loaded
    • n COUNT style queries when posts are not already loaded
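
To make the three scenarios concrete, here is a toy model in plain Ruby. FakeAssociation and queries_for are hypothetical names invented for this sketch, not Rails classes; the model only logs which kind of query each method would trigger, and it assumes #count always hits the database, as described above.

```ruby
# Toy model only: FakeAssociation is hypothetical and merely logs simulated queries.
class FakeAssociation
  attr_reader :queries

  def initialize(rows, loaded)
    @rows = rows      # stands in for the rows stored in the database
    @loaded = loaded  # true when the association was eager loaded
    @queries = []     # log of simulated SQL statements
  end

  # Behaves like #count in the question: always issues a COUNT query.
  def count
    @queries << "SELECT COUNT(*)"
    @rows.size
  end

  # Behaves like #length: loads every row unless they are already in memory.
  def length
    load_rows
    @rows.size
  end

  # The "revised #count": free when loaded, a single COUNT query otherwise.
  def revised_count
    @loaded ? @rows.size : count
  end

  private

  def load_rows
    return if @loaded
    @queries << "SELECT *"
    @loaded = true
  end
end

# Returns the simulated queries triggered by calling one method once.
def queries_for(method_name, loaded)
  association = FakeAssociation.new(%w[a b c], loaded)
  association.public_send(method_name)
  association.queries
end
```

Calling queries_for(:revised_count, true) logs nothing, while queries_for(:revised_count, false) logs a single COUNT, which is the combination neither #count nor #length offers on its own.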

So what's the correct approach here? Is there something I've overlooked (very, very likely)?


Solution

  • It appears that the best way to implement this sort of facility might be to create SQL views (ref: here and here) for the separate model-and-child-count objects that you want, along with their associated ActiveRecord models.

    You might be able to be very clever and use subclassing on the original model combined with set_table_name :sql_view_name to retain all the original methods on the objects, and maybe even some of their associations.

    For instance, say we were to add 'Post.has_many :comments' to your example, like in @Zubin's answer above; then one might be able to do:

       class CreatePostsWithCommentsCountsView < ActiveRecord::Migration
          def self.up
            # Create a SQL view called posts_with_comments_counts that wraps
            # the query @Zubin pointed out above. Because it lives in the
            # database, perhaps we'll be able to run further queries against
            # it *as though it were any other table.*
            execute <<-SQL
              CREATE VIEW posts_with_comments_counts AS
                SELECT posts.*, COUNT(comments.id) AS comments_count
                FROM posts
                LEFT OUTER JOIN comments ON comments.post_id = posts.id
                GROUP BY posts.id
            SQL
          end

          def self.down
            # Drop the view so the migration can be rolled back.
            execute "DROP VIEW posts_with_comments_counts"
          end
       end
    
       class PostWithCommentsCount < Post         #Here there be cleverness.
                                                  #The class definition sets up PWCC 
                                                  # with all the regular methods of 
                                                  # Post (pointing to the posts table
                                                  # due to Rails' STI facility.)
    
        set_table_name :posts_with_comments_counts #But then we point it to the 
                                                  # SQL view instead.
                                                  #If you don't really care about
                                                  # the methods of Post being in PWCC
                                                  # then you could just make it a 
                                                  # normal subclass of AR::Base.
       end
    
       PostWithCommentsCount.all(:include => :user)  #Obviously, this sort of "upward
         # looking" include is best used in big lists like "latest posts" rather than
         # "These posts for this user." But hopefully it illustrates the improved 
         # activerecordiness of this style of solution.
       PostWithCommentsCount.all(:include => :comments) #And I'm pretty sure you 
         # should be able to do this without issue as well. And it _should_ only be 
         # the two queries.
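
For reference, the rows such a view would return can be sketched in plain Ruby, with in-memory structs standing in for the tables. PostRow, CommentRow and the posts_with_comments_counts method are hypothetical names used for illustration only; in the real setup the database does this work in one grouped query instead of n COUNT queries.

```ruby
# Hypothetical in-memory rows standing in for the posts and comments tables.
PostRow = Struct.new(:id, :title)
CommentRow = Struct.new(:id, :post_id)

# Roughly what the view's LEFT OUTER JOIN + GROUP BY produces: one row per
# post with a comments_count, including posts that have no comments at all.
def posts_with_comments_counts(posts, comments)
  counts = Hash.new(0) # missing keys read as 0, like the outer join does
  comments.each { |comment| counts[comment.post_id] += 1 }
  posts.map do |post|
    { id: post.id, title: post.title, comments_count: counts[post.id] }
  end
end

posts = [PostRow.new(1, "First"), PostRow.new(2, "Second")]
comments = [CommentRow.new(1, 1), CommentRow.new(2, 1)]
rows = posts_with_comments_counts(posts, comments)
```

The point of the outer join is visible in the second post: it has no comments but still gets a row, with a comments_count of 0, so it would not silently disappear from a "latest posts" list.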