Tags: algorithm, social-networking, voting, scoring

Scoring algorithms: how to convert the number & % of "Likes" & "Dislikes" into a single score?


I have a website where users can "Like" and "Dislike" items.

So for each item, I have data such as the total number of "Likes" and the % of total votes that are "Likes".

I'd like to calculate a single score to show to users. Using just the % wouldn't work: even though item_A might have 90% "Likes" while item_B has only 80% "Likes", item_B should still rank ahead of item_A if item_B has 10,000 total votes while item_A has only 1,000.

Likewise, using just the total number of "Likes" wouldn't work, because an item with a large number of "Likes" shouldn't rank very high if its % of "Likes" is low.

What would be a good algorithm to create a single score out of the data above?

Ideally the score should be "meaningful" or "normalized" in some way. For example if I go to IMDB and I see that a movie has a score of 8/10, I'd immediately know that it is a good movie. On the other hand if I see a score of 1,370 I wouldn't necessarily know if that is good or bad.


Solution

  • There are a couple of very good articles on how Reddit does this sort of ranking here, and here. In a nutshell: rank posts by the lower bound of the 90% confidence interval on their fraction of positive votes. Entries with fewer votes have wider confidence intervals, and hence tend to rank lower than entries with more votes but the same average.
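
To make the idea concrete, here is a minimal sketch in Python. It assumes the Wilson score interval as the confidence interval, which is one common choice for a like/dislike proportion; the answer above doesn't name a specific interval, so treat that as my assumption. The function name `wilson_lower_bound` and its parameters are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_lower_bound(likes: int, dislikes: int, confidence: float = 0.90) -> float:
    """Lower end of the Wilson score interval for the fraction of "Likes".

    Assumption: the Wilson interval stands in for "the confidence interval";
    other intervals would work the same way in spirit.
    """
    n = likes + dislikes
    if n == 0:
        # No votes yet: nothing is known, so rank the item at the bottom.
        return 0.0

    # z-value for a two-sided interval at the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = likes / n  # observed fraction of "Likes"

    denominator = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denominator


# Same 90% "Likes", very different numbers of votes.
few_votes = wilson_lower_bound(9, 1)
many_votes = wilson_lower_bound(900, 100)
print(few_votes < many_votes)  # the item with fewer votes gets the lower score
```

The result always lies between 0 and 1, so it can be shown as a percentage or multiplied by 10 for an IMDB-style "out of 10" score. Note that the lower bound never exceeds the raw % of "Likes"; it only approaches it as votes accumulate.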