Search code examples
mysqlnorthwind

Finding all combinations of two products that never were ordered together


I have an array in SQL containing all orders and all products from those orders and I need to find all pairs of products that have never been ordered together in one order.

I started with something like this

SELECT a.orderid, a.productid, b.OrderID, b.ProductID from [Order Details] a cross join [Order Details] b
except  
SELECT a.orderid, a.productid, b.OrderID, b.ProductID from [Order Details] a cross join [Order Details] b
where a.ProductID=b.ProductID

but I don't know how to eliminate rest of the possible pairs so that I have only the ones I need left.

EDIT: Modified the query a bit, went for a diffrent approach. I'm getting closer but still don't know how to get rid of repetitions like ab and ba

select p1.productid, p2.productid from products p1 join products p2 on p1.productid<>p2.productid
except
select a.productid, b.productid from [Order Details] a join [Order Details] b on a.ProductID<>b.ProductID
where a.OrderID=b.OrderID

Solution

  • Use a left join and filter on missed joins:

    select p1.productid, p2.productid
    from products p1
    join products p2 on p1.productid < p2.productid
    left join [Order Details] o1 on o1.productid = p1.productid
    left join [Order Details] o2 on o2.productid = p2.productid
        and o2.OrderID = o1.OrderID
    where o2.OrderID is null
    

    This works because missed joins have all nulls in the row, and while the join condition is applied during the join, the where clause is applied after the join, so specifying that a joined column (that can't in reality be null) is null leaves only missed joins.

    Another small but important point is the use of less than rather than not equals in the join condition when joining the product table to itself, which prevents products joining to themselves, but importantly prevents two products joining to themselves twice - which ultimately means more efficiency (half as many joins) and not having to use distinct to remove duplicate combinations.