I know similar questions like this have been asked before, but I have not seen one for more than 2 tables. And there seems to be a difference.
I have three tables from which I need fields, customers
where I need customerID
and orderID
from, orders
from which I get customerID
and orderID
and lineitems
from which I get orderID
and quantity
(= quantity ordered).
I want to find out how many customers bought more than 2 of the same item, so basically quantity > 2 with:
SELECT COUNT(DISTINCT custID)
FROM customers
WHERE EXISTS(
SELECT *
FROM customers C, orders O, lineitems L
WHERE C.custID = O.custID AND O.orderID = L.orderID AND L.quantity > 2
);
I do not understand why it is returning me the count of all rows. I am correlating the subqueries before checking the >2 condition, am I not?
I am a beginner at SQL, so I'd be thankful if you could explain it to me fundamentally, if necessary. Thanks.
You don't have to repeat customers
table in the EXISTS
subquery. This is the idea of correlation: use the table of the outer query in order to correlate.
SELECT COUNT(DISTINCT custID)
FROM customers c
WHERE EXISTS(
SELECT *
FROM orders O
JOIN lineitems L ON O.orderID = L.orderID
WHERE C.custID = O.custID AND L.quantity > 2
);