I have two tables, a customers
and orders
table.
The customers
table contains a unique ID
for each customer. It contains 1141 entries.
The orders
table contains many entries with a customerID
and a date
.
I am trying to query my database and return a list of customers and the max(date)
from the orders list.
SELECT *
FROM customers
INNER JOIN
(
SELECT CustomerID, max(date) as date
FROM orders
GROUP BY CustomerID
) Sub1
ON customers.id = Sub1.CustomerID
INNER JOIN orders
ON orders.CustomerID = Sub1.CustomerID
AND orders.date = Sub1.Date
However this query is returning 1726 rows instead of 1141 rows. Where is this getting extra from?
The problem is that there are ties.
For a given customer, some place more than one order per day. So there's a possibility that occasionally some may have placed more than one order on the date that is their max date.
To fix this, you need to use MAX()
or some column that is always unique in the Orders table (or at least unique within a given date). This is easy if you can depend on an auto-increment primary key in the Orders table:
SELECT *
FROM customers
INNER JOIN
(
SELECT CustomerID, max(orderid) as orderid as date
FROM orders
GROUP BY CustomerID
) Sub1
ON customers.id = Sub1.CustomerID
INNER JOIN orders
ON orders.CustomerID = Sub1.CustomerID
AND orders.orderid = Sub1.orderid
This assumes that orderid increases in lock-step with increasing dates. That is, you'll never have an order with a greater auto-inc id but an earlier date. That might happen if you allow data to be entered out of chronological order, e.g. back-dating orders.