Search code examples
sqlruby-on-railsactiverecordmany-to-manyrelational-database

Rails Many-to-many relationship with extension generating incorrect SQL


I'm having an issue where a many-to-many relationship with an "extension" is generating incorrect SQL.

class OrderItem < ApplicationRecord
  belongs_to :buyer, class_name: :User
  belongs_to :order
  belongs_to :item, polymorphic: true
end

class User < ApplicationRecord
  has_many :order_items_bought,
    -> { joins(:order).where.not(orders: { state: :expired }).order(created_at: :desc) },
    foreign_key: :buyer_id,
    class_name: :OrderItem

  has_many :videos_bought,
    -> { joins(:orders).select('DISTINCT ON (videos.id) videos.*').reorder('videos.id DESC') },
    through: :order_items_bought,
    source: :item,
    source_type: :Video do
      def confirmed
        where(orders: { state: :confirmed })
      end
    end
end

user.videos_bought.confirmed generates this SQL:

Video Load (47.0ms)  SELECT  DISTINCT ON (videos.id) videos.* FROM 
"videos" INNER JOIN "order_items" "order_items_videos_join" ON 
"order_items_videos_join"."item_id" = "videos"."id" AND 
"order_items_videos_join"."item_type" = $1 INNER JOIN 
"orders" ON "orders"."id" = "order_items_videos_join"."order_id" INNER JOIN 
"order_items" ON "videos"."id" = "order_items"."item_id" WHERE 
"order_items"."buyer_id" = $2 AND ("orders"."state" != $3) AND "order_items"."item_type" = $4 AND 
"orders"."state" = $5 ORDER BY videos.id DESC, "order_items"."created_at" DESC LIMIT $6

Which returns some Video records which are joined with orders that do NOT have state confirmed. I would expect all orders to have state confirmed.

If I use raw SQL everything works fine:

has_many :videos_bought,
  -> {
    joins('INNER JOIN orders ON orders.id = order_items.order_id')
      .select('DISTINCT ON (videos.id) videos.*')
      .reorder('videos.id DESC')
  },
  through: :order_items_bought,
  source: :item,
  source_type: :Video do
    def confirmed
      where(orders: { state: :confirmed })
    end
end

Now user.videos_bought.confirmed generates this SQL:

Video Load (5.4ms)  SELECT  DISTINCT ON (videos.id) videos.* FROM 
"videos" INNER JOIN "order_items" ON 
"videos"."id" = "order_items"."item_id" INNER JOIN orders ON 
orders.id = order_items.order_id WHERE 
"order_items"."buyer_id" = $1 AND ("orders"."state" != $2) AND
"order_items"."item_type" = $3 AND "orders"."state" = $4 ORDER BY
videos.id DESC, "order_items"."created_at" DESC LIMIT $5

Which seems more succinct because it avoids the auto generated order_items_videos_join name. It also only returns orders that have state confirmed.

Any idea what is going on? Does ActiveRecord just generate faulty SQL sometimes?

Using rails 5.1.5. Upgrading to latest made no difference.

I'm hoping to get an explanation on why Rails generates the order_items_videos_join string in the first case but not in the second case. Also, why the second SQL query produces incorrect results. I can edit the question with more code and data samples if needed.


Solution

  • ActiveRecord does not just generate faulty SQL sometimes, but there's a little nuance to it such that starting simple is best when it comes to defining relationships. For example, let's rework your queries to get that DISTINCT ON out of there. I've never seen a need to use that SQL clause.

    Before chaining highly customized association logic, let's just see if there's simpler way to query first, and then check to see whether there's a strong case for turning your queries into associations.

    Looks like you've got a schema like this:

    User

    • id

    Order

    • id
    • state

    OrderItem

    • id
    • order_id
    • buyer_id (any reason this is on OrderItem and not on Order?)
    • item_type (Video)
    • item_id (videos.id)

    Video

    • id

    A couple of tidbits

    • No need to create association extensions for query conditions that would make perfectly good scopes on the model. See below.

    A perfectly good query might look like

    Video.joins(order_item: :order).
        where(order_items: {
           buyer_id: 123, 
           order: {
              state: 'confirmed'
           }).
        # The following was part of the initial logic but
        #   doesn't alter the query results.
        # where.not(order_items: {
        #    order: {state: 'expired'}
        # }).
        order('id desc')
    

    Here's another way:

    class User < ActiveRecord::Base
      has_many :order_items, foreign_key: 'buyer_id'
      def videos_purchased
        Video.where(id: order_items.videos.confirmed.pluck(:id))
      end
    end
    
    class OrderItem < ActiveRecord::Base
      belongs_to :order, class_name: 'User', foreign_key: 'buyer_id'
      belongs_to :item, polymorphic: true
      scope :bought, -> {where.not(orders: {state: 'cancelled'})}
      scope :videos, -> {where(item_type: 'Video')}
    end
    
    class Video < ActiveRecord::Base
      has_many :order_items, ...
    
      scope :confirmed, -> {where(orders: {state: 'confirmed'})}
    end
    
    ...
    user = User.first()
    user.videos_purchased
    

    I might have the syntax a little screwy when it comes to table and attribute names, but this should be a start.

    Notice that I changed it from one to two queries. I suggest running with that until you really notice that you have a performance problem, and even then you might have an easier time just caching queries, than trying to optimize complex join logic.