Search code examples
sqlruby-on-railspostgresqlrails-activerecordrelational-division

SQL where joined set must contain all values but may contain more


I have three tables offers, sports and the join table offers_sports.

class Offer < ActiveRecord::Base
  has_and_belongs_to_many :sports
end

class Sport < ActiveRecord::Base
  has_and_belongs_to_many :offers
end

I want to select offers that include a given array of sport names. They must contain all of the sports but may have more.

Lets say I have these three offers:

light:
  - "Yoga"
  - "Bodyboarding"
medium:
  - "Yoga"
  - "Bodyboarding"
  - "Surfing"
all:
  - "Yoga"
  - "Bodyboarding"
  - "Surfing"
  - "Parasailing"
  - "Skydiving"

Given the array ["Bodyboarding", "Surfing"] I would want to get medium and all but not light.

I have tried something along the lines of this answer but I get zero rows in the result:

Offer.joins(:sports)
     .where(sports: { name: ["Bodyboarding", "Surfing"] })
     .group("sports.name")
     .having("COUNT(distinct sports.name) = 2")

Translated to SQL:

SELECT "offers".* 
FROM "offers" 
INNER JOIN "offers_sports" ON "offers_sports"."offer_id" = "offers"."id"     
INNER JOIN "sports" ON "sports"."id" = "offers_sports"."sport_id" 
  WHERE "sports"."name" IN ('Bodyboarding', 'Surfing') 
GROUP BY sports.name 
HAVING COUNT(distinct sports.name) = 2;

An ActiveRecord answer would be nice but I'll settle for just SQL, preferably Postgres compatible.

Data:

offers
======================
id | name
----------------------
1  | light
2  | medium
3  | all
4  | extreme

sports
======================
id | name
----------------------
1  | "Yoga"
2  | "Bodyboarding"
3  | "Surfing"
4  | "Parasailing"
5  | "Skydiving"

offers_sports
======================
offer_id | sport_id
----------------------
1        | 1
1        | 2
2        | 1
2        | 2
2        | 3
3        | 1
3        | 2
3        | 3
3        | 4
3        | 5
4        | 3
4        | 4
4        | 5

Solution

  • Group by offer.id, not by sports.name (or sports.id):

    SELECT o.*
    FROM   sports        s
    JOIN   offers_sports os ON os.sport_id = s.id
    JOIN   offers        o  ON os.offer_id = o.id
    WHERE  s.name IN ('Bodyboarding', 'Surfing') 
    GROUP  BY o.id  -- !!
    HAVING count(*) = 2;
    

    Assuming the typical implementation:

    • offer.id and sports.id are defined as primary key.
    • sports.name is defined unique.
    • (sport_id, offer_id) in offers_sports is defined unique (or PK).

    You don't need DISTINCT in the count. And count(*) is even a bit cheaper, yet.

    Related answer with an arsenal of possible techniques:


    Added by @max (the OP) - this is the above query rolled into ActiveRecord:

    class Offer < ActiveRecord::Base
      has_and_belongs_to_many :sports
      def self.includes_sports(*sport_names)
        joins(:sports)
          .where(sports: { name: sport_names })
          .group('offers.id')
          .having("count(*) = ?", sport_names.size)
      end
    end