Search code examples
mysqlsqlgreatest-n-per-group

Retrieving latest dates grouped by a key column value (MySql)


Given the following table with purchase data.

CREATE TABLE myTable (
 id INT NOT NULL AUTO_INCREMENT,
 date DATETIME NOT NULL,
 subNo SMALLINT NOT NULL,
 poNo INT NOT NULL,
 PRIMARY KEY (id))

INSERT INTO myTable VALUES (0, '2022-11-01 12:43', 1, 800), (0, '2022-11-02 13:00', 1, 800), (0, '2022-11-03 12:43', 2, 800), (0, '2022-11-03 14:00', 1, 923), (0, '2022-11-03 15:00', 2, 800), (0, '2022-11-04 12:43', 1, 800)
 Id |      Date        | SubNo | PO# |
----|------------------|-------|-----|
100 | 2022-11-01 12:43 |   1   | 800 |
101 | 2022-11-02 13:00 |   1   | 800 |
102 | 2022-11-03 12:43 |   2   | 800 |
103 | 2022-11-03 14:00 |   1   | 923 |
104 | 2022-11-03 15:00 |   2   | 800 |
105 | 2022-11-04 12:43 |   1   | 800 |

SubNo is the ordinal number of a subset or partial quantity of the purchase (PO#). There can be more than 30 subsets to a purchase.

I am looking for a query supplying for a given purchase for each of its subsets the latest date.
For PO 800 it would look like this:

 Id |       Date       | SubNo | PO# |
----|------------------|-------|-----|
105 | 2022-11-04 12:43 |   1   | 800 |
104 | 2022-11-03 15:00 |   2   | 800 |

I haven't found a way to filter the latest dates. A rough approach is

SELECT id, date, subNo
FROM myTable
WHERE poNo=800
GROUP BY subNo
ORDER BY subNo, date DESC

but DISTINCT and GROUP BY do not guarantee to return the latest date.

Then I tried to create a VIEW first, to be used in a later query.

CREATE VIEW myView AS
SELECT subNo s, (SELECT MAX(date) FROM myTable WHERE poNo=800 AND subNo=s) AS dd
FROM myTable
WHERE poNo=800
GROUP BY s

But although the query is ok, the result differs when used for a VIEW, probably due to VIEW restrictions.

Finally I tried a joined table

SELECT id, datum, subNo s
FROM myTable my JOIN (SELECT MAX(date) AS d FROM myTable WHERE poNo=800 AND subNo=s) tmp ON my.date=tmp.d
WHERE poNo=800

but getting the error "Unknown column 's' in where clause.

My MySql version is 8.0.22


Solution

  • You can check if (date, subno) corresponds to one of the pairs of ( MAX(date), subno) :

    SELECT id, date, subno
    FROM mytable
    WHERE pono = 800 AND (date, subno) IN (
        SELECT MAX(date), subno
        FROM mytable
        WHERE pono = 800
        GROUP BY subno
    )
    GROUP BY subno;
    

    My result in a clean table :

    +----+---------------------+-------+
    | id | date                | subno |
    +----+---------------------+-------+
    |  6 | 2022-11-04 12:43:00 |     1 |
    |  5 | 2022-11-03 15:00:00 |     2 |
    +----+---------------------+-------+
    

    Depending on how you want to to manage multiple rows being the max with the same subno, you might want to remove the last GROUP BY subno. With it, it only shows one of them. Without, it shows all the duplicated max rows.