Search code examples
sqlsql-servergreatest-n-per-group

SQL Query slows down when more IDs are added


Hello so I am facing an issue with my SQL query. I need to get the average location for each ID on the maximal day. While the query below gives me the correct result it seems that the run time grows exponential depending on the number of IDS. So while it is almost real time for 1 or 2 IDS it takes a few minutes for 3 and very long for 5. So running 5 queries for 5 IDs is a lot faster than running a single one containing all 5. (I have a lot of IDs for which I need to check this, also the database is quite big).

How does this happen and how do I avoid it?

My guess would be that the problem comes from having the same ID list selected twice but I am not sure how to fix it/ if this really is the problem. I toke a look at all the similar issues on stack overflow but I did not find any which was answering my problem.

SELECT t.ID, MAX(date) AS MAX(date), AVG(Latitude) AS lat, AVG(Longitude) AS long
FROM table1 t
INNER JOIN (
    SELECT ID, max(date) AS maxdate
    FROM table1 t2
    WHERE ID IN ('1', '2', '3')
    GROUP BY ID
) tm ON t.ID = tm.ID AND t.date = tm.maxdate
WHERE t.ID IN ('1', '2', '3')
GROUP BY t.ID

Solution

  • You can give window functions a try:

    WITH cte AS (
        SELECT *, RANK() OVER (PARTITION BY id ORDER BY date DESC) AS rn
        FROM t
        WHERE id IN ('1', '2', '3')
    )
    SELECT id, max(date), min(...)
    FROM cte
    WHERE rn = 1
    GROUP BY id