Search code examples
mysqlsqljoingreatest-n-per-groupmaxdate

Get the max column value for each unique ID


I have two tables, a customers and orders table.

The customers table contains a unique ID for each customer. It contains 1141 entries.

The orders table contains many entries with a customerID and a date.

I am trying to query my database and return a list of customers and the max(date) from the orders list.

SELECT *
FROM customers
INNER JOIN
(
    SELECT CustomerID, max(date) as date
    FROM orders
    GROUP BY CustomerID
) Sub1
ON customers.id = Sub1.CustomerID
INNER JOIN orders
ON orders.CustomerID = Sub1.CustomerID
AND orders.date = Sub1.Date

However this query is returning 1726 rows instead of 1141 rows. Where is this getting extra from?


Solution

  • The problem is that there are ties.

    For a given customer, some place more than one order per day. So there's a possibility that occasionally some may have placed more than one order on the date that is their max date.

    To fix this, you need to use MAX() or some column that is always unique in the Orders table (or at least unique within a given date). This is easy if you can depend on an auto-increment primary key in the Orders table:

    SELECT *
    FROM customers
    INNER JOIN
    (
        SELECT CustomerID, max(orderid) as orderid as date
        FROM orders
        GROUP BY CustomerID
    ) Sub1
    ON customers.id = Sub1.CustomerID
    INNER JOIN orders
    ON orders.CustomerID = Sub1.CustomerID
    AND orders.orderid = Sub1.orderid
    

    This assumes that orderid increases in lock-step with increasing dates. That is, you'll never have an order with a greater auto-inc id but an earlier date. That might happen if you allow data to be entered out of chronological order, e.g. back-dating orders.