I have been practising SQL, and came across this behaviour i couldnt explain. ( I am also the one who asked this question : Over() function does not cover all rows in the table) -> its a different problem.
Suppose i have a table like this
movie_id | user_id | rating | created_at |
---|---|---|---|
1 | 1 | 3 | 2020-01-12 |
1 | 2 | 4 | 2020-02-11 |
1 | 3 | 2 | 2020-02-12 |
1 | 4 | 1 | 2020-01-01 |
2 | 1 | 5 | 2020-02-17 |
2 | 2 | 2 | 2020-02-01 |
2 | 3 | 2 | 2020-03-01 |
3 | 1 | 3 | 2020-02-22 |
3 | 2 | 4 | 2020-02-25 |
What I am trying to do, is to rank the movie by rating, which i have this SQL query:
SELECT
movie_id,
rank() over(partition by movie_id order by avg(rating) desc) as rank_rate
FROM
MovieRating
From my previous question, i learnt that the over() function will operate in a window selected by the query, basically the window this query returns:
SELECT movie_id FROM MovieRating
So I would expect to see at least 3 rows here, for id 1, 2 and 3.
The result is however just one row:
{"headers": ["movie_id", "rank_rate"], "values": [[1, 1]]}
Why is that ? Is something wrong with my understanding regarding how over()
function works ?
You need an aggregation query and use RANK()
window function on its results:
SELECT movie_id,
AVG(rating) AS average_rating, -- you may remove this line if you don't actually need the average rating
RANK() OVER (ORDER BY AVG(rating) DESC) AS rank_rate
FROM MovieRating
GROUP BY movie_id
ORDER BY rank_rate;
See the demo.
Your query is an aggregation query without a group by
clause and this means that it operates on the whole table and not to each movie_id
. Such queries return only 1 row with the result of the aggregation.
When yo apply RANK()
window function, it will operate on that single row and not on the table.