Tags: sorting, statistics, user-experience, rating, bayesian

What is a better way to sort by a 5 star rating?


I'm trying to sort a bunch of products by customer ratings using a 5 star system. The site I'm setting this up for does not have a lot of ratings and continues to add new products, so it will usually have a few products with a low number of ratings.

I tried using average star rating but that algorithm fails when there is a small number of ratings.

For example, a product with three 5-star ratings would rank above a product with one hundred 5-star ratings and two 2-star ratings.

Shouldn't the second product show up higher, since its larger number of ratings makes it statistically more trustworthy?
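
To put numbers on the example, here is a quick Python sketch of the plain-average approach. The variable names are made up for illustration; only the vote counts come from the example above.

```python
# Vote lists mirroring the example: 3x 5 stars vs. 100x 5 stars plus 2x 2 stars.
few_votes = [5] * 3
many_votes = [5] * 100 + [2] * 2

# Plain average star rating for each product.
avg_few = sum(few_votes) / len(few_votes)
avg_many = sum(many_votes) / len(many_votes)

print(f"3 ratings:   {avg_few:.2f}")
print(f"102 ratings: {avg_many:.2f}")

# Sorting by plain average puts the 3-rating product first.
print("few-votes product ranks first:", avg_few > avg_many)
```

The two 2-star ratings pull the second product's average just below 5, so a plain average places it under a product with only three votes.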


Solution

  • Prior to 2015, the Internet Movie Database (IMDb) publicly listed the formula used to rank their Top 250 movies list. To quote:

    The formula for calculating the Top Rated 250 Titles gives a true Bayesian estimate:

    weighted rating (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C
    

    where:

    • R = average for the movie (mean)
    • v = number of votes for the movie
    • m = minimum votes required to be listed in the Top 250 (currently 25000)
    • C = the mean vote across the whole report (currently 7.0)

    For the Top 250, only votes from regular voters are considered.

    It's not so hard to understand. The formula is:

    rating = (v / (v + m)) * R +
             (m / (v + m)) * C;
    

    Which can be mathematically simplified to:

    rating = (R * v + C * m) / (v + m);
    

    The variables are:

    • R – The item's own rating. R is the average of the item's votes. (For example, if an item has no votes, its R is 0. If someone gives it 5 stars, R becomes 5. If someone else gives it 1 star, R becomes 3, the average of [1, 5]. And so on.)
    • C – The average item's rating. Find the R of every single item in the database, including the current one, and take the average of them; that is C. (Suppose there are 4 items in the database, and their ratings are [2, 3, 5, 5]. C is 3.75, the average of those numbers.)
    • v – The number of votes for an item. (To give another example, if 5 people have cast votes on an item, v is 5.)
    • m – The tuneable parameter. The amount of "smoothing" applied to the rating is based on the number of votes (v) in relation to m. Adjust m until the results satisfy you. And don't misinterpret IMDb's description of m as "minimum votes required to be listed" – this system is perfectly capable of ranking items with fewer votes than m.
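
As a rough illustration of how the four variables fit together, here is a minimal Python sketch. The names `bayesian_rating` and `average`, the sample catalog, and the choice of `m = 10` are all assumptions made up for this example; none of them come from IMDb.

```python
def bayesian_rating(R, v, C, m):
    """Weighted rating: (R * v + C * m) / (v + m)."""
    return (R * v + C * m) / (v + m)


def average(values):
    """Plain mean; returns 0 for an empty list, matching R = 0 for no votes."""
    return sum(values) / len(values) if values else 0


# Hypothetical catalog: item name -> list of star votes.
catalog = {
    "few_votes": [5, 5, 5],
    "many_votes": [5] * 100 + [2] * 2,
    "middling": [3, 4, 2, 3],
    "poor": [1, 2],
}

# R for every item, then C as the average of all the R values.
item_R = {name: average(votes) for name, votes in catalog.items()}
C = average(list(item_R.values()))
m = 10  # tuneable smoothing parameter (assumed value, adjust to taste)

# v is simply the number of votes each item has received.
scores = {
    name: bayesian_rating(item_R[name], len(votes), C, m)
    for name, votes in catalog.items()
}

# Sort item names from highest to lowest weighted rating.
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name}: R={item_R[name]:.2f}, weighted={scores[name]:.2f}")
```

With these assumed numbers, the item with 102 votes ranks above the item with three perfect votes, which is the ordering the question asks for. A larger m pulls low-vote items harder toward C; a smaller m lets them keep more of their own average.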

    All the formula does is: add m imaginary votes, each with a value of C, before calculating the average. In the beginning, when there isn't enough data (i.e. the number of votes is far smaller than m), this causes the blanks to be filled in with average data. However, as votes accumulate, the imaginary votes are eventually drowned out by real ones.
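
The "imaginary votes" reading can be checked directly. The sketch below is only an illustration: the function names and the sample values of `C` and `m` are assumptions, and the padding version needs `m` to be a whole number.

```python
import math


def smoothed_by_padding(votes, C, m):
    """Average the real votes together with m imaginary votes worth C each."""
    padded = list(votes) + [C] * m
    return sum(padded) / len(padded)


def smoothed_by_formula(votes, C, m):
    """Same quantity computed as (R * v + C * m) / (v + m)."""
    v = len(votes)
    R = sum(votes) / v if v else 0
    return (R * v + C * m) / (v + m)


votes = [5, 1, 4]  # hypothetical real votes for one item
C, m = 3.5, 5      # assumed global average and smoothing weight

by_padding = smoothed_by_padding(votes, C, m)
by_formula = smoothed_by_formula(votes, C, m)

print(by_padding, by_formula)
print("equal up to rounding:", math.isclose(by_padding, by_formula))
```

Both routes give the same number up to floating-point rounding, since R * v is just the sum of the real votes.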

    In this system, votes don't cause the rating to fluctuate wildly. Instead, they merely perturb it a bit in some direction.

    When there are zero votes, only imaginary votes exist, and all of them are C. Thus, each item begins with a rating of C.
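
To see this behaviour over time, here is a small simulation sketch. It assumes a hypothetical item that only ever receives 5-star votes, with made-up values `C = 3.0` and `m = 20`.

```python
def bayesian_rating(R, v, C, m):
    """(R * v + C * m) / (v + m); with v = 0 this reduces to C."""
    return (R * v + C * m) / (v + m)


C, m = 3.0, 20  # assumed global average and smoothing weight

votes = []
history = []  # history[i] is the rating after i real votes
for _ in range(200):
    v = len(votes)
    R = sum(votes) / v if v else 0
    history.append(bayesian_rating(R, v, C, m))
    votes.append(5)  # every new vote is 5 stars in this scenario

for count in (0, 1, 5, 20, 199):
    print(f"after {count:3d} votes: {history[count]:.3f}")
```

The rating starts at exactly C, moves only slightly after the first vote, and then climbs steadily toward 5 as real votes outnumber the m imaginary ones. A plain average would have jumped straight to 5.0 after a single vote.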

    See also: