Search code examples
sqlsql-order-bydistinct

DISTINCT and ORDER BY in the same command without using the order by variable


I need to select the recent 5 games the user has gone into but I don't want there to be duplicates. So I need a SQL line that will take them in order from most recent to least but also make sure there are no distinct. To order them I have a time variable, but if I use:

select distinct name, image, time 
from history 
where userID = USERID 
order by time desc

this removes all the other times apart from the first ones meaning that I will get the result in the order of which game was played first ever.

If you still don't understand here is the schema and some of the data I have:

Schema:

CREATE TABLE history (userID integer, name text not null, image text not null, time text not null);

Data:

6 | Arcade Mathematicians | arcadeMathematicians | 2021-04-17 07:59:00  
6 | Arcade Mathematicians | arcadeMathematicians | 2021-04-17 08:01:04  
6 | Infinite Spinner | infiniteSpinner | 2021-04-17 08:01:08  
6 | Arcade Mathematicians | arcadeMathematicians | 2021-04-17 08:01:12  
6 | Arcade Mathematicians | arcadeMathematicians | 2021-04-17 08:02:13  
6 | Infinite Spinner | infiniteSpinner | 2021-04-17 08:02:16  
6 | Arcade Mathematicians | arcadeMathematicians | 2021-04-17 08:03:16  
6 | Infinite Spinner | infiniteSpinner | 2021-04-17 08:03:18  
6 | Arcade Mathematicians | arcadeMathematicians | 2021-04-17 08:03:21 

So if you see here if I use distinct on time it won't work, as everything would still be displayed.


Solution

  • Use ROW_NUMBER() window function to rank each row of the table based on user and game and ordered by time descending.
    From these results keep only the rows with rank = 1, meaning the latest rows for each combination of user and game.
    Then use ROW_NUMBER() again to rank the remaining rows based on user and ordered by time descending and from the results keep only the rows with rank <= 5:

    WITH 
      cte1 AS (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY userid, name ORDER BY time DESC) rn1
        FROM history
      ),
      cte2 AS (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY userid ORDER BY time DESC) rn2
        FROM cte1
        WHERE rn1 = 1
      )
    SELECT userid, name, image, time 
    FROM cte2
    WHERE rn2 <= 5 
    

    If you want results for a specific user it is easier, just use MAX() window function and sort the results to get the top 5 rows:

    SELECT DISTINCT userid, name, image, 
           MAX(time) OVER (PARTITION BY name) time 
    FROM history
    WHERE userid = 6
    ORDER BY time DESC LIMIT 5
    

    Or if the image may be different for the same name form row to row:

    SELECT DISTINCT userid, name, 
           FIRST_VALUE(image) OVER (PARTITION BY name ORDER BY time DESC) image, 
           MAX(time) OVER (PARTITION BY name) time 
    FROM history
    WHERE userid = 6
    ORDER BY time DESC LIMIT 5
    

    See the demo.
    Results (for your sample data):

    userid name image time
    6 Arcade Mathematicians arcadeMathematicians 2021-04-17 08:03:21
    6 Infinite Spinner infiniteSpinner 2021-04-17 08:03:18