What is the difference between this:
User.includes(:sites).where("sites.name = 'test'").references(:sites)
and this:
User.joins(:sites).where(sites: { name: "test"} )
Is there a general consensus on using one method over the other?
They do completely different things.
.joins(:sites)
adds a INNER JOIN
to the relation which limits the results to users with at least one site before the where clause is applied.
No columns off the joined table will be selected (unless you do so manually). Use it when you just want to filter the results with a join but don't care about the data in the related table or with .select
to select aggregates.
It does not prevent N+1 queries when you access the association:
users = User.joins(:sites).where(sites: { name: "test" } )
# SELECT "users".*
# FROM "users"
# INNER JOIN "sites"
# ON "sites"."user_id" = "users"."id"
# WHERE "sites"."name" = 'test'
users.each do |user|
user.sites.each do |site|
puts site.name # creates an additional database query for every site
end
end
.includes
avoids N+1 queries in a "smart" way. It's extremely overused and you're usually better off just thinking about how you want to use the data and using the methods it delegates to directly.
It will try to use .preload
to do two queries where the first would select the data off the users table and the second query will select data off the sites
tables when the assocation is first accessed on any member of the collection. This will avoid loading the data unless it's actually used while at the same time preventing a N+1 query for every member of the association.
users = User.includes(:sites) # SELECT "users".* FROM "users"
users.each do |user|
user.sites.each do |site|
puts site.name # only one additional database to fetch the sites is fired
end
end
However if the query references the sites table through the where clause .preload
won't work as it's not actually joining anything. .includes
is actually smart enough to detect this in most cases and will use .eager_load
to load everything in a single query with an outer join.
.references
explicitly tells .includes
to use .eager_load
and in that example will prevent a database driver that occurs when the query doesn't join sites
.
User.includes(:sites).where("sites.name = 'test'")
# User Load (0.8ms) SELECT "users".* FROM "users" WHERE (sites.name = 'test') /* loading for pp */ LIMIT 11 /*application='Sandbox8'*/
# An error occurred when inspecting the object: #<ActiveRecord::StatementInvalid:"SQLite3::SQLException: no such column: sites.name:\nSELECT \"users\".* FROM \"users\" WHERE (sites.name = 'test') /* loading for pp */ LIMIT 11 /*application='Sandbox8'*/\n
But if you just pass a hash to .where
instead of actively sabotaging it with a SQL string for the sake of the example it does just fine:
User.includes(:sites).where(sites: { name: "test" })
It's even more pointless if you consider that you could just cut out the middle man instead of being so indecisive.
# This runs just fine
User.eager_load(:sites).where("sites.name = 'test'")
# But this is preferable
User.eager_load(:sites).where(sites: { name: "test" })
I guess .references
could still fill a very niche role if you're building a query programatically and want to use the lazy loading nature of .includes
but at the same time avoid a reference error (this scenario took a lot of imagination). But in general - YAGNI.