Search code examples
sqlruby-on-railsrubypostgresqlrails-postgresql

Rails: Optimize querying maximum values from associated table


I need to show a list of partners and the maximum value from the reservation_limit column from Klass table.

Partner  has_many    :klasses
Klass    belongs_to  :partner

# Partner controller
def index
  @partners = Partner.includes(:klasses)
end

# view
<% @partners.each do |partner| %>
  Up to <%= partner.klasses.maximum("reservation_limit") %> visits per month
<% end %>

Unfortunately the query below runs for every single Partner.

SELECT MAX("klasses"."reservation_limit") FROM "klasses" WHERE "klasses"."partner_id" = $1  [["partner_id", 1]]

If there are 40 partners then the query will run 40 times. How do I optimize this?


edit: Looks like there's a limit method in rails so I'm changing the limit in question to reservation_limit to prevent confusion.


Solution

  • You can use two forms of SQL to efficiently retrieve this information, and I'm assuming here that you want a result for a partner even where there is no klass record for it

    The first is:

       select partners.*,
              max(klasses.limit) as max_klasses_limit
         from partners
    left join klasses on klasses.partner_id = partners.id
     group by partner.id
    

    Some RDBMSs require that you use "group by partner.*", though, which is potentially expensive in terms of the required sort and the possibility of it spilling to disk.

    On the other hand you can add a clause such as:

    having("max(klasses.limit) > ?", 3)
    

    ... to efficiently filter the partners by their value of maximum klass.limit

    The other is:

       select partners.*,
              (Select max(klasses.limit)
                 from klasses
                where klasses.partner_id = partners.id) as max_klasses_limit
         from partners
    

    The second one does not rely on a group by, and in some RDBMSs may be effectively transformed internally to the first form, but may execute less efficiently by the subquery being executed once per row in the partners table (which would stil be much faster than the raw Rails way of actually submitting a query per row).

    The Rails ActiveRecord forms of these would be:

    Partner.joins("left join klasses on klasses.partner_id = partners.id").
            select("partners.*, max(klasses.limit) as max_klasses_limit").
            group(:id)
    

    ... and ...

    Partner.select("partners.*, (select max(klasses.limit)
                   from klasses
                   where klasses.partner_id = partners.id) as max_klasses_limit")
    

    Which of these is actually the most efficient is probably going to depend on the RDBMS and even the RDBMS version.

    If you don't need a result when there is no klass for the partner, or there is always guaranteed to be one, then:

    Partner.joins(:klasses).
            select("partners.*, max(klasses.limit) as max_klasses_limit").
            group(:id)
    

    Either way, you can then reference

    partner.max_klasses_limit