Search code examples
sqlruby-on-railsjoinscope

Scope with different nested joins


My database consists of runners attending sessions

So:

class Runner 
  has_many :attendances
  has_many :attended_happenings, through: :attendances, source: :happening
end
class Attendance < ActiveRecord::Base
  belongs_to :happening
  belongs_to :runner
end
class Happening
  has_many :attendances, dependent: :destroy
  has_many :attendees, through: :attendances, source: :runner
end

I want to find all the runners who have not attended any happenings in the last 9 weeks. In the runner model I've created a scope :inactive_for_nine_weeks :

left_joins(:attendances)
  .joins("join happenings on attendances.happening_id = happenings.id")
  .where("attendances.id is null")
  .where("happenings.started_at > ?", 9.weeks.ago)

I figure I need to left_join on attendances to get the runners that haven't got attendances, and then I need to do a regular join on happenings.

I'm getting no results though when I know there should be, so clearly have executed this poorly.

SQL resulting from getting a count from the scope:

SELECT COUNT(*) FROM "runners" LEFT OUTER JOIN "attendances" ON "attendances"."runner_id" = "runners"."id" join happenings on attendances.happening_id = happenings.id WHERE (attendances.id is null) AND (happenings.started_at > '2023-02-27 08:54:12.460052')

Any tips would be much appreciated!


Solution

  • The most straight forward solution in ActiveRecord is to just to do a subquery:

    class Runner < ApplicationRecord
      has_many :attendances
      has_many :attended_happenings, through: :attendances, source: :happening
    
      # scope is really just syntactic sugar for defining class methods
      # and is really only suited for one liners as multi-line lambas are horrible for readibity
      def self.inactive_since(time = 9.weeks.ago)
        where.not(
          id: Attendance.joins(:happening)
                        .where(happenings: { started_at:  ...time })
                        .select(:runner_id)
        )
      end
    end
    

    Runner.inactive_since gives the following SQL:

    SELECT "runners".* FROM "runners" 
    WHERE "runners"."id" NOT IN (
      SELECT "attendances"."runner_id" FROM "attendances" 
      INNER JOIN "happenings" ON "happenings"."id" = "attendances"."happening_id" 
      AND "happenings"."started_at" > $1
    )
    

    You could also use EXISTS instead of WHERE IN as a slight optimization (on some DBs).

    class Runner < ApplicationRecord
      def self.inactive_exists(time = 9.weeks.ago)
        where.not(
          Attendance.joins(:happening)
                    .select(1)
                    .where.not(Attendance.arel_table[:runner_id].eq(arel_table[:id]))
                    .where(happenings: { started_at: ...time })
                    .arel
                    .exists
        )
      end
    end
    
    SELECT "runners".* FROM "runners" 
    WHERE NOT (
      EXISTS (
       SELECT 1 FROM "attendances" 
       INNER JOIN "happenings" ON "happenings"."id" = "attendances"."happening_id" 
       WHERE "attendances"."runner_id" = "runners"."id" 
       AND "happenings"."started_at" < $1
      )
    )
    

    If you need it you can go for more optimized solutions that are not polyglot like lateral or cross joins or using COUNT(happenings.*) FILTER (WHERE started_at < ?) = 0 on Postgres. But I would start with the subquery and cross that bridge when you get to it.