Search code examples
mysqlsumrankrow-number

Weighted SUM with rank in MySQL


I've been trying to compute a sum for each user of a table when the sum is weighted by the rank of each row according to some order. It's in the context of a video game, where the top score is worth 100%, the second one is worth 95%, the third is worth (0.95 * 0.95) * 100% and so on.

The formula can be simply described as SUM(score * POW(0.95, score's position)), yet everything I tried seems to hit a MySQL limitation.

First this option doesn't work because it seems the ORDER BY pls.Score DESC is called after the SUM (SUM doesn't seem changed by changing the order).

SELECT p.PlayerId, p.Username, 
        (SELECT SUM(pls.Score * POW(0.95, @i := if(@c = p.PlayerId, @i + 1, if(@c := p.PlayerId, 0, 0))))
        FROM player_level_stats pls, (SELECT @i := -1, @c := '') params
        WHERE pls.PlayerId = p.PlayerId
        ORDER BY pls.Score DESC) * 10 AS Score
FROM player p
ORDER BY 3 DESC;

Also I have to use this variable mess because I cannot have ROW_NUMBER() OVER () inside a SUM().

Now OK, my next thing to try was to SUM in an outer query so I can sort before summing:

SELECT p.PlayerId, p.Username, 
        (SELECT SUM(t.Score)
        FROM (SELECT pls.Score * POW(0.95, @i := if(@c = p.PlayerId, @i + 1, if(@c := p.PlayerId, 0, 0)))
            FROM player_level_stats pls, (SELECT @i := -1, @c := '') params
            WHERE pls.PlayerId = p.PlayerId
            ORDER BY pls.Score DESC) AS t) * 10 AS Score
FROM player p
ORDER BY 3 DESC
LIMIT 50;

Sounds good in theory but in practice there's a MySQL limitation where you cannot pass a parameter to a nested subquery (answers about this basically tell you to try something else). As the nested subquery is in the FROM, it seems really difficult to pass something to the intermediate query.

Of course there will always be the slow SUM(pls.Score * POW(0.95, (SELECT COUNT(*) FROM player_level_stats ipls WHERE pls.Score > ipls.Score))) but this is O(number of scores ^ 2) instead of O(number of scores * log(number of scores)) making the whole thing much slower.

Is there any way to do this in MySQL? Alternatively is there any way to do this with MariaDB?

EDIT: Here's an even minimal version (that do not work either)

SELECT p.PlayerId, (SELECT SUM(pls.Score * (RANK() OVER (ORDER BY pls.Score DESC)))
        FROM player_level_stats pls
        WHERE pls.PlayerId = p.PlayerId
        ORDER BY pls.Score DESC) AS WeightedScore
FROM player p;

This simply multiplies the score of each player by the rank of that score (best score * 1 + second best score * 2 + ... n best score * n). This is essentially the same problem but simplified.


Solution

  • If I understood the problem correctly, what you can do is a subquery with the sums and then use row_number() over(order by) to get the rank. Something like this:

    SELECT playerID, SUM(score * weighted) AS totalscore
    FROM
    (
     SELECT 
      playerID, score, POW(0.95,(ROW_NUMBER() OVER(partition by playerID ORDER BY score desc)-1)) as weighted
      FROM player_level_stats
     ) AS totals
    GROUP BY playerID