Search code examples
postgresqlrediskey-value-store

Get Redis values while scanning


I've created a Redis key / value index this way :

set 7:12:321 '{"some:"JSON"}'

The key is delimited by a colon separator, each part of the key represents a hierarchic index. get 7:12:321 means that I know the exact hierarchy and want only one single item

scan 7:12:* means that I want every item under id 7 in the first level of hierarchy and id 12 in the second layer of hierarchy.

Problem is : if I want the JSON values, I have to first scan (~50000 entries in a few ms) then get every key returned by scan one by one (800ms).

This is not very efficient. And this is the only answer I found on stackoverflow searching "scanning Redis values".

1/ Is there another way of scanning Redis to get values or key / value pairs and not only keys ? I tried hscan with as follows :

hset myindex 7:12:321 '{"some:"JSON"}' hscan myindex MATCH 7:12:*   But it destroys the performance (almost 4s for the 50000 entries)

2/ Is there another data structure in Redis I could use in the same way but which could "scan for values" (hset ?)

3/ Should I go with another data storage solution (PostgreSQL ltree for instance ?) to suit my use case with huge performance ?

I must be missing something really obvious, 'cause this sounds like a common use case.

Thanks for your answers.


Solution

  • Optimization for your current solution

    Instead of geting every key returned by scan one-by-one, you should use mget to batch get key-value pairs, or use pipeline to reduce RTT.

    Efficiency problem of your current solution

    scan command iterates all keys in the database, even if the number of keys that match the pattern is small. The performance decreases when the number of keys increases.

    Another solution

    Since the hierarchic index is an integer, you can encode the hierarchic indexes into a number, and use that number as the score of a sorted set. In this way, instead of searching by pattern, you can search by score range, which is very fast with a sorted set. Take the following as an example.

    Say, the first (right-most) hierarchic index is less than 1000, the second index is less than 100, then you can encode the index (e.g. 7:12:321) into a score (321 + 12 * 1000 + 7 * 100 * 1000 = 712321). Then set the score and the value into a sorted set: zadd myindex 712321 '{"some:"JSON"}'.

    When you want to search keys that match 7:12:*, just use zrangebyscore command to get data with a score between 712000 and 712999: zrangebyscore myindex 712000 712999 withscores.

    In this way, you can get key (decoded with the returned score) and value together. Also it should be faster than the scan solution.

    UPDATE

    The solution has a little problem: members of sorted set must be unique, so you cannot have 2 keys with the same value (i.e. json string).

    // insert OK
    zadd myindex 712321 '{"the_same":"JSON"}'
    // failed to insert, members should be unique
    zadd myindex 712322 '{"the_same":"JSON"}'
    

    In order to solve this problem, you can combine the key with the json string to make it unique:

    zadd myindex 712321 '7:12:321-{"the_same":"JSON"}'
    zadd myindex 712321 '7:12:322-{"the_same":"JSON"}'