I am trying to understand how MySql processes a date condition differently when I set it on the 'WHERE' part of the code vs the 'HAVING' part. Can you please help me understand what is the difference in the logic in the below cases? How come I get just one product when I apply the date condition using HAVING, but I get two if I use the WHERE?
Question: Write an SQL query that reports the products that were only sold in spring 2019. That is, between 2019-01-01 and 2019-03-31 inclusive.
+------------+--------------+------------+
| product_id | product_name | unit_price |
+------------+--------------+------------+
| 1 | S8 | 1000 |
| 2 | G4 | 800 |
| 3 | iPhone | 1400 |
+------------+--------------+------------+
Sales table:
+-----------+------------+----------+------------+----------+-------+
| seller_id | product_id | buyer_id | sale_date | quantity | price |
+-----------+------------+----------+------------+----------+-------+
| 1 | 1 | 1 | 2019-01-21 | 2 | 2000 |
| 1 | 2 | 2 | 2019-02-17 | 1 | 800 |
| 2 | 2 | 3 | 2019-06-02 | 1 | 800 |
| 3 | 3 | 4 | 2019-05-13 | 2 | 2800 |
+-----------+------------+----------+------------+----------+-------+
HAVING OPTION (the correct one)
FROM Sales s
JOIN Product p
ON s.product_id = p.product_id
GROUP by s.product_id
HAVING min(sale_date)>='2019-01-01' AND max(sale_date)<='2019-03-31'
HAVING RESULT {"headers": ["product_id", "product_name"], "values": [[1, "S8"]]}
WHERE OPTION
FROM Sales s
JOIN Product p
ON s.product_id = p.product_id
WHERE sale_date BETWEEN '2019-01-01' AND '2019-03-31'
GROUP by s.product_id
WHERE RESULT {"headers": ["product_id", "product_name"], "values": [[1, "S8"], [2, "G4"]]}
Where does that 2, G4 come from? (Apologies in advance if this is trivial, I am genuinely trying to learn on my own and I don't have anyone to ask)
The WHERE
is filtering the rows to only consider sales in Jan, Feb, and March. So, any product that has sales in that period are included.
On the other hand, the HAVING
is considering all sales for a product. Then for each product it is looking at the minimum and maximum dates. Only products that have all sales in Jan, Feb, and March are included.
Consider product_id = 2
. It has a sale in February, so it would be included by the WHERE
clause. However, it also has a sale in June. So the maximum date is Jun -- and the HAVING
clause filters it out.