today i posted a question if bayesan was a good method in my case to do a rating system weight. since i got no answers i tried to implement if my self.
in my database users can add a vote from 1 to 5 stars to every groups. then i have to display a leaderboard by those votes. what i was doing until now is to order them by votes average without a weight. this is not so nice because a group having 5.0 with 20 votes is before of a group having 4.9 avg and 10000 votes.
this is my table:
CREATE TABLE IF NOT EXISTS votes(
user_id BIGINT,
group_id BIGINT,
vote SMALLINT,
vote_date timestamp,
PRIMARY KEY (user_id, group_id)
this is a query i tried to make and it seems to work:
SELECT
v.group_id,
s_ref.title,
s_ref.username,
COUNT(vote) AS amount,
ROUND(AVG(vote), 1)::float AS average,
-- (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C
( (COUNT(vote)::float / (COUNT(vote)+10)) * AVG(vote)::float + (10::float / (COUNT(vote)+10)) * 4 ) AS bayesan,
RANK() OVER (PARTITION BY s.lang ORDER BY( (COUNT(vote)::float / (COUNT(vote)+10)) * AVG(vote)::float + (10::float / (COUNT(vote)+10)) * 4 ) DESC)
FROM votes AS v
LEFT OUTER JOIN supergroups_ref AS s_ref
ON s_ref.group_id = v.group_id
LEFT OUTER JOIN supergroups AS s
ON s.group_id = v.group_id
GROUP BY v.group_id, s_ref.title, s_ref.username, s.nsfw, s.banned_until, s.lang, s.category, s.bot_inside
HAVING
(s.banned_until IS NULL OR s.banned_until < now())
AND COUNT(vote) >= 10
AND s.bot_inside IS TRUE;
but i miss only a thing (i hope): that 4 in the bayesan formula in the query is just a fake value because reading this i haven't understood what is mean with 'C'. can you please tell me what i should add to c?
After Google'ing Bayesian Estimate, going to the Wikipedia page for it, reading a little, and going to the section called "Practical example of Bayes estimators", it states
where W is the weighted rating and C is the average rating of all films
So that most likely translates for you to mean it is an average vote/rating regardless of group.