Search code examples
redisdata-storagehigh-load

Storing huge randomized list of similar items in Redis


Redis 2.0.3

I need to store a huge list of items in Redis. Each item is a short string (less than 256 characters).

I need to do two operations on the list:

  • Add many (several thousands to a million) identical items. (Several times a day)

  • Remove one random item from the list. It is not necessary to have "fair" random. Any "good enough" approach will do. (Up to several hundred times a second)

I do not have enough RAM to store all items in a list one by one.

I think that I need to store items in batches, name and a counter. (There will be up to several thousands distinct items, more like several hundreds.)

But I'm not sure how to organize this effectively.

Any hints?


Solution

  • Well, since nobody is up to help me, here is a "dumb" solution, in pseudocode.

    1. Get random element:

      function maybe_get_next_item()
        item_name = SRANDMEMBER "items-set"
        item_key = "items:" + item_name
      
        new_item_count = DECR (item_key)
      
        if new_item_count < 0 then
          LOCK -- As explained in SETNX docs
            new_item_count = GET (item_key) -- More added while we were locking?
            if new_item_count and new_item_count < 0 then
              SREM (item_name) -- No, expire it
            end
          UNLOCK
        end
      
        if new_item_count and new_item_count >= 0 then
          return item_name
        end
      
        return false -- this item not found
      end
      
      function get_next_item()
        item_name = maybe_get_next_item()
        while not item_name and (SCARD "items-set" > 0) do
          item_name = maybe_get_next_item()
        end
        return item_name -- false if all items are expended
      end
      
    2. Insert new elements

      function insert_items(item_name, amount)
        LOCK -- As explained in SETNX docs
          SADD "items-set" (item_name)
          INCRBY ("items:" + item_name) amount
        UNLOCK
      end
      

    Please do suggest a better solution if it exists, I'm still groking Redis, and may miss something obvious.

    I suspect that LOCK/UNLOCK in insert_items() may be superfluous and can be replaced with MULTI/EXEC, but I think that it is needed for LOCK/UNLOCK in maybe_get_next_item() to work properly (which I do not know how to replace with MULTI/EXEC)...