Search code examples
sqlsqlitegroup-byinner-joinhaving

find an average of a column using group with inner join and then filtering through the groups


I've been trying to solve an sqlite question where I have two tables: Movies and movie_cast.

Movies has the columns: id, movie_title, and `score. Here is a sample of the data:

  • 11|Star Wars|76.496
  • 62|2001:Space Odyssey|39.064
  • 152|Start Trek|26.551

movie_cast has the columns: movie_id, cast_id, cast_name, birthday, popularity. Here is a sample.

  • 11|2|Mark Hamill|9/25/51|15.015

  • 11|3|Harrison Ford|10/21/56|8.905

  • 11|5|Peter Cushing|05/26/13|6.35

IN this case movies.id and movie_cast.movie_id are the same. The question is to Find the top ten cast members who have the highest average movie scores. Do not include movies with score <25 in the average score calculation. ▪ Exclude cast members who have appeared in two or fewer movies.

My query is as below but it doesn't seem to get me the right answer.

SELECT movie_cast.cast_id, 
       movie_cast.cast_name, 
       printf("%.2f",CAST(AVG(movies.score) as float)), 
       COUNT(movie_cast.cast_name)
FROM movies
INNER JOIN movie_cast ON movies.id = movie_cast.movie_id
WHERE movies.score >= 25
GROUP BY movie_cast.cast_id
HAVING COUNT(movie_cast.cast_name) > 2 
ORDER BY AVG(movies.score ) DESC, movie_cast.cast_name ASC
LIMIT 10  

The answers I get are in the format cast_id,cat_name,avg score.

-And example is: 3 Harrison Ford 52.30

I've analyzed and re-analyzed my logic but to no avail. I'm not sure where I'm going wrong. Any help would be great! Thank you!


Solution

  • This is how I would write the query:

    SELECT mc.cast_id, 
           mc.cast_name, 
           PRINTF('%.2f', AVG(m.score)) avg_score
    FROM movie_cast mc INNER JOIN movies m 
    ON m.id = mc.movie_id
    WHERE m.score >= 25
    GROUP BY mc.cast_id, mc.cast_name
    HAVING COUNT(*) > 2
    ORDER BY AVG(m.score) DESC, mc.cast_name ASC
    LIMIT 10;
    

    I use aliases for the tables to shorten the code and make it more readable.
    There is no need to cast the average to a float because the average in SQLite is always a real number.
    Both COUNT(movie_cast.cast_name) can be simplified to COUNT(*) but the 1st one in the SELECT list is not needed by your requirement (if it is then add it).
    The function PRINTF() returns a string, but if you want a number returned then use ROUND():

    ROUND(AVG(m.score), 2) avg_score