Search code examples
mysqlsqlrelational-divisionnot-exists

Division query (WHERE NOT EXISTS) on 4 tables to return all customers purchased all the products


I would like to use a division query to have all the customers who bought all products.

The template is the following :

SELECT *
FROM Customers AS A
WHERE NOT EXISTS
(
    SELECT *
    FROM Products AS B
    WHERE NOT EXISTS
    (
        SELECT *
        FROM Purchases AS C
        WHERE C.CustomerID= A.CustomerID
        AND C.ProductID= B.ProductID
    ));

It's well explained in this topic: https://stackoverflow.com/a/71877187/17783040

My issue is I don't have 3 tables to join but 4. The division query is supposed to work the same but I can't succeed it.

So my 4 tables:

  • Customers: PK is CustomerID
  • Products: PK is ProductID
  • Invoices: PK is InvoiceID and FK is CustomerID
  • InvoicesLines: FK are InvoiceID and ProductID

I would like to practice this kind of query even if it's an hard way. I really understood the division query working for 3 tables in a many to many relationship.

But in my case, I need a 4th table to join CustomerID to ProductID because the intermediate table as InvoiceLine collects each line of the invoice for a particular ProductID and quatity purchased (+InvoiceID). Then the Invoice table contains only the InvoiceID and CustomerID.

For example with data:

Customers Table:

CustomerID  name        surname address
1           Charles     Smith   123 main street
2           Henry       Johnson 546 Drive
3           Jennifer    Davis   65 Avenue

Products Table:

ProductID   name    quantity
1           pc      12
2           usb     56
3           tv      67

Invoices Table:

InvoiceID   CustomerID
1           1
2           3
3           2
4           3
5           1

InvoiceLines Table:

InvoicelineID   ProductID   description InvoiceID
1               2           blablabla   1
2               1           blablabla   2
3               2           blablabla   2
4               1           blablabla   2
5               2           blablabla   3
6               1           blablabla   3
7               3           blablabla   4
8               3           blablabla   5

So it's the CustomerID #3 who have bought all the products (1,2 and 3) that we can see in the InvoiceLines table for the IncoiceID #2 and #4

This following query returns all the customers in the database without any restriction and not only the one who purchased all the products in catalogue:

SELECT *
FROM Customers AS C
WHERE NOT EXISTS
(
    SELECT *
    FROM Products AS P
    WHERE NOT EXISTS
    (
        SELECT *
        FROM Invoices AS I
        WHERE NOT EXISTS
            (
                SELECT *
                FROM InvoiceLines AS L
                WHERE I.CustomerID= C.CustomerID
                AND L.InvoiceID= I.InvoiceID
                AND L.ProductID= P.ProductID
            )));

It's other try is not working either:

SELECT *
FROM Customers AS C
WHERE NOT EXISTS
(
    SELECT *
    FROM Products AS P
    WHERE NOT EXISTS
    (
        SELECT *
        FROM 
            (
            SELECT * FROM InvoiceLines AS L, Invoices AS I
            WHERE  L.InvoiceID= I.InvoiceID
            ) AS Pu
        WHERE Pu.CustomerID= C.CustomerID
        AND Pu.ProductID= P.ProductID
        )
    );

Do you have any hint or advice to make it work using the division query template?


Solution

  • The logic in your last query is correct, only that you need to replace select * with select l.ProductID, i.CustomerID (otherwise you will get "duplicate column name" errors):

    SELECT *
    FROM Customers AS C
    WHERE NOT EXISTS
    (
        SELECT *
        FROM Products AS P
        WHERE NOT EXISTS
        (
            SELECT *
            FROM 
                (
                SELECT L.ProductID, I.CustomerID
                FROM InvoiceLines AS L
                JOIN Invoices AS I ON L.InvoiceID = I.InvoiceID
                ) AS Pu
            WHERE Pu.CustomerID = C.CustomerID
            AND Pu.ProductID = P.ProductID
        )
    )