Tags: redis, amazon-elasticache, redis-cache

How key/value pairs are stored in Redis


We use an ElastiCache Redis node to store data.

All keys have the same format:
- Key is an MD5 hash: 128 bits (16 bytes), stored as a 32-character string (32 bytes).
- Value is a timestamp string: 19 bytes.
In total, each key/value pair is 32 + 19 = 51 bytes.

We have 84,917,361 keys.
I assumed the total memory that Redis should consume would be close to 84917361 * 51 bytes ≈ 4.03 GB.

In fact, it takes 11.07 GB.
Output of the INFO command: used_memory_human:11.07G

  1. What is the remaining ~7 GB of memory spent on?
  2. Is there a way to store the MD5 as a 16-byte hash rather than a 32-character string?

Thanks, any help is highly appreciated.


Solution

  • What is the remaining ~7 GB of memory spent on?

    Short answer: Redis does NOT store keys and values as raw strings.

    In fact,

    1. The key is stored as an sds string with an sdshdr header (newer versions use a more compact header), which adds overhead such as the stored string length.

    2. The value is wrapped in a redisObject structure, which adds its own overhead, e.g. the object encoding and refcount.

    3. There is further overhead when Redis inserts the pair into a dict, e.g. the next pointer and key pointer in a dictEntry structure.

    All of this overhead accounts for the rest of the memory.
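As a rough check on the overhead described above, the figures from the question can be turned into a per-key number. This is only a sketch: the observed values come straight from the question, while the struct sizes in `ASSUMED_OVERHEAD` are assumptions about a 64-bit build with the older sdshdr layout, not measurements from this node.

```python
# Figures taken from the question.
NUM_KEYS = 84_917_361
PAYLOAD_BYTES = 32 + 19                # hex MD5 key + timestamp value
USED_MEMORY_BYTES = 11.07 * 1024**3    # used_memory_human:11.07G

# What the node actually spends per key, and how much of that is not payload.
observed_per_key = USED_MEMORY_BYTES / NUM_KEYS
observed_overhead = observed_per_key - PAYLOAD_BYTES

# Assumed per-entry costs in bytes (illustrative, not measured).
ASSUMED_OVERHEAD = {
    "dictEntry (key, value and next pointers)": 24,
    "redisObject header for the value": 16,
    "string header + terminator for the key": 9,
    "string header + terminator for the value": 9,
}
assumed_struct_overhead = sum(ASSUMED_OVERHEAD.values())

print(f"observed bytes per key:     {observed_per_key:.1f}")
print(f"observed overhead per key:  {observed_overhead:.1f}")
print(f"assumed struct overhead:    {assumed_struct_overhead}")
```

The observed cost works out to roughly 140 bytes per key, against 51 bytes of payload. The assumed struct sizes explain only part of the gap; the remainder would plausibly come from the hash table's bucket array and from the allocator rounding each allocation up to a size class.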

    To make the data more memory efficient, see the article that @Kevin Christopher Henry mentioned: grouping keys into small hashes saves much of the per-key redisObject overhead, and small hashes can use the ziplist encoding, which packs elements more compactly in memory.
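The small-hash idea can be sketched as follows: instead of one top-level key per MD5, a prefix of the MD5 names a Redis hash and the remainder becomes a field inside it. The function name `bucket_and_field`, the `b:` prefix, and the prefix length of 5 are hypothetical choices for illustration, not something taken from the article.

```python
from typing import Tuple

NUM_KEYS = 84_917_361  # key count from the question


def bucket_and_field(md5_hex: str, prefix_len: int = 5) -> Tuple[str, str]:
    """Split a 32-character hex MD5 into a hash name (bucket) and a field."""
    if len(md5_hex) != 32:
        raise ValueError("expected a 32-character hex MD5 string")
    return "b:" + md5_hex[:prefix_len], md5_hex[prefix_len:]


bucket, field = bucket_and_field("9e107d9d372bb6826bd81d3542a419d6")

# Average number of fields per bucket if MD5 prefixes are evenly distributed.
avg_fields_per_bucket = NUM_KEYS / 16 ** 5

# With a Redis client, the pair would then be written and read roughly as:
#   HSET <bucket> <field> <timestamp>
#   HGET <bucket> <field>
print(bucket, field, round(avg_fields_per_bucket))
```

The prefix length controls how many fields land in each hash. The compact encoding only applies while a hash stays under the configured limits (`hash-max-ziplist-entries` and `hash-max-ziplist-value`), so the prefix should be chosen against the values set on your node.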

  • Is there a way to store the MD5 as a 16-byte hash rather than a 32-character string?

    Use a hash function, such as MurmurHash, to create a digest of each MD5 string.

    This gives you an 8-byte (64-bit) digest. However, you cannot recover the original MD5 string from the MurmurHash digest, so use this method only if you do NOT need the MD5 value itself.
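A minimal sketch of shortening the key, under two stated assumptions. First, MurmurHash is not in the Python standard library, so BLAKE2b with an 8-byte digest is swapped in here purely as a stand-in for the 64-bit digest step. Second, because Redis strings are binary-safe, the sketch also shows a lossless alternative: decoding the hex text back into the 16 raw MD5 bytes.

```python
import hashlib

md5_hex = "9e107d9d372bb6826bd81d3542a419d6"  # example 32-character hex MD5

# Lossless: decode the hex text into the 16 raw bytes of the MD5.
raw_key = bytes.fromhex(md5_hex)
recovered = raw_key.hex()  # the original hex string can be rebuilt

# Lossy: an 8-byte digest of the MD5 string.
# BLAKE2b is a stand-in for MurmurHash here, not the method from the answer.
short_key = hashlib.blake2b(md5_hex.encode("ascii"), digest_size=8).digest()

print(len(md5_hex), len(raw_key), len(short_key))
```

The raw 16-byte form halves the key payload while keeping the MD5 recoverable. The 8-byte digest halves it again, but the MD5 can no longer be recovered, and two different MD5s can in principle map to the same digest.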