Search code examples
redisspring-data-redis

How to get same rank for same scores in Redis' ZRANK?


If I have 5 members with scores as follows

a - 1
b - 2
c - 3
d - 3
e - 5

ZRANK of c returns 2, ZRANK of d returns 3

Is there a way to get same rank for same scores?
Example: ZRANK c = 2, d = 2, e = 3
If yes, then how to implement that in spring-data-redis?


Solution

  • Any real solution needs to fit the requirements, which are kind of missing in the original question. My 1st answer had assumed a small dataset, but this approach does not scale as dense ranking is done (e.g. via Lua) in O(N) at least.

    So, assuming that there are a lot of users with scores, the direction that for_stack suggested is better, in which multiple data structures are combined. I believe this is the gist of his last remark.

    To store users' scores you can use a Hash. While conceptually you can use a single key to store a Hash of all users scores, in practice you'd want to hash the Hash so it will scale. To keep this example simple, I'll ignore Hash scaling.

    This is how you'd add (update) a user's score in Lua:

    local hscores_key = KEYS[1]
    local user = ARGV[1]
    local increment = ARGV[2]
    local new_score = redis.call('HINCRBY', hscores_key, user, increment)
    

    Next, we want to track the current count of users per discrete score value so we keep another hash for that:

    local old_score = new_score - increment
    local hcounts_key = KEYS[2]
    local old_count = redis.call('HINCRBY', hcounts_key, old_score, -1)
    local new_count = redis.call('HINCRBY', hcounts_key, new_score, 1)
    

    Now, the last thing we need to maintain is the per score rank, with a sorted set. Every new score is added as a member in the zset, and scores that have no more users are removed:

    local zdranks_key = KEYS[3]
    if new_count == 1 then
      redis.call('ZADD', zdranks_key, new_score, new_score)
    end
    if old_count == 0 then
      redis.call('ZREM', zdranks_key, old_score)
    end
    

    This 3-piece-script's complexity is O(logN) due to the use of the Sorted Set, but note that N is the number of discrete score values, not the users in the system. Getting a user's dense ranking is done via another, shorter and simpler script:

    local hscores_key = KEYS[1]
    local zdranks_key = KEYS[2]
    local user = ARGV[1]
    
    local score = redis.call('HGET', hscores_key, user)
    return redis.call('ZRANK', zdranks_key, score)