I have a table with a composite primary key (ID
, Date
) like below.
+------+------------+-------+ | ID | Date | Value | +------+------------+-------+ | 1 | 1433419200 | 15 | | 1 | 1433332800 | 23 | | 1 | 1433246400 | 41 | | 1 | 1433160000 | 55 | | 1 | 1432900800 | 24 | | 2 | 1433419200 | 52 | | 2 | 1433332800 | 23 | | 2 | 1433246400 | 39 | | 2 | 1433160000 | 22 | | 3 | 1433419200 | 11 | | 3 | 1433246400 | 58 | | ... | ... | ... | +------+------------+-------+
There is also a separate index on Date
column. The table is of moderate size, currently ~600k row and growing by ~2k everyday.
I want to do a single SELECT query that returns the latest 3 records (ordered by Date
timestamp) for each ID
. For each given ID
, the Date
values are always unique, so no need to worry about ties for Date
here.
I've tried a self-join approach, inspired by this answer, but it took quite a few seconds to run and returned nothing:
SELECT p1.ID, p1.Date, p1.Value FROM MyTable AS p1
LEFT JOIN MyTable AS p2
ON p1.ID=p2.ID AND p1.Date<=p2.Date
GROUP BY p1.ID
HAVING COUNT(*)<=5
ORDER BY p1.ID, p1.Date DESC;
What would be a fast solution here?
You could look up the three most recent dates for each ID:
SELECT ID, Date, Value
FROM MyTable
WHERE Date IN (SELECT Date
FROM MyTable AS T2
WHERE T2.ID = MyTable.ID
ORDER BY Date DESC
LIMIT 3)
Alternatively, look up the third most recent date for each ID, and use it as a limit:
SELECT ID, Date, Value
FROM MyTable
WHERE Date >= IFNULL((SELECT Date
FROM MyTable AS T2
WHERE T2.ID = MyTable.ID
ORDER BY Date DESC
LIMIT 1 OFFSET 2),
0)
Both queries should get good performance from the primary key's index.