Search code examples
sqlpostgresqlrating

bayesan formula sql: rating system with weight


today i posted a question if bayesan was a good method in my case to do a rating system weight. since i got no answers i tried to implement if my self.

in my database users can add a vote from 1 to 5 stars to every groups. then i have to display a leaderboard by those votes. what i was doing until now is to order them by votes average without a weight. this is not so nice because a group having 5.0 with 20 votes is before of a group having 4.9 avg and 10000 votes.

this is my table:

CREATE TABLE IF NOT EXISTS votes(
user_id BIGINT,
group_id BIGINT,
vote SMALLINT,
vote_date timestamp,
PRIMARY KEY (user_id, group_id)

this is a query i tried to make and it seems to work:

SELECT 
    v.group_id,
    s_ref.title, 
    s_ref.username, 
    COUNT(vote) AS amount, 
    ROUND(AVG(vote), 1)::float AS average,
    -- (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C 
    (  (COUNT(vote)::float / (COUNT(vote)+10)) * AVG(vote)::float + (10::float / (COUNT(vote)+10)) * 4 ) AS bayesan,
    RANK() OVER (PARTITION BY s.lang  ORDER BY(  (COUNT(vote)::float / (COUNT(vote)+10)) * AVG(vote)::float + (10::float / (COUNT(vote)+10)) * 4 ) DESC)
FROM votes AS v
LEFT OUTER JOIN supergroups_ref AS s_ref
ON s_ref.group_id = v.group_id
LEFT OUTER JOIN supergroups AS s
ON s.group_id = v.group_id
GROUP BY v.group_id, s_ref.title, s_ref.username, s.nsfw, s.banned_until, s.lang, s.category, s.bot_inside
HAVING 
    (s.banned_until IS NULL OR s.banned_until < now()) 
    AND COUNT(vote) >= 10 
    AND s.bot_inside IS TRUE;

but i miss only a thing (i hope): that 4 in the bayesan formula in the query is just a fake value because reading this i haven't understood what is mean with 'C'. can you please tell me what i should add to c?


Solution

  • After Google'ing Bayesian Estimate, going to the Wikipedia page for it, reading a little, and going to the section called "Practical example of Bayes estimators", it states

    where W is the weighted rating and C is the average rating of all films

    So that most likely translates for you to mean it is an average vote/rating regardless of group.