
Why does a Redis hash bucket save memory?


I read this blog: http://instagram-engineering.tumblr.com/post/12202313862/storing-hundreds-of-millions-of-simple-key-value-pairs

The Instagram engineers do a great job of explaining how they saved space. However, I'd like a more detailed explanation of why the zipmap hash bucket saves memory. Is it because you only need to allocate int types instead of a lot of long types?

Thanks, guys.


Solution

  • The main thing to understand here is that pointers take up a lot of space. If you serialize a hash and keep it as a single string with no per-pair key and value pointers, you save a lot of space, because you go from at least one pointer per pair to zero pointers.

    Redis is an in-memory datastore and tries to save as much space as it can without seriously affecting performance. To do that, it keeps small hashes in a compact serialized form (the zipmap) and scans through the whole thing linearly whenever a hash operation is performed. Strictly speaking this is O(n), but because the hash is small it has little effect on performance while saving a lot of memory. Once the hash gets large, Redis converts it to a real hash table. It then takes up much more space, but seek, write, and delete become the usual (average) O(1) hash operations.

    Redis provides the "hash-max-zipmap-entries" setting, among several similar settings, to let you configure exactly where this conversion happens. (Later Redis versions renamed it: "hash-max-ziplist-entries" as of 2.6 and "hash-max-listpack-entries" as of 7.0.) What the Instagram engineers figured out is that they could set this conversion point higher than the default to save more space at the cost of higher CPU load. For them, this was a good trade-off. I highly recommend reading here for more information.
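To make the trade-off above concrete, here is a minimal toy sketch in Python. It is not Redis's actual zipmap layout: the `SmallHash` class, its `max_flat_entries` parameter (a stand-in for the conversion setting), and the 2-byte length prefixes are all assumptions made for illustration. It keeps a small hash as one flat byte string that is scanned linearly, and converts to a real hash table once it grows past the threshold. The `bucket_for` helper is likewise hypothetical and only shows the general idea of grouping many IDs into small hashes so that each one stays below that threshold.

```python
import struct


class SmallHash:
    """Toy model: one flat byte string while small, a dict once it grows."""

    def __init__(self, max_flat_entries=4):
        # Hypothetical stand-in for the Redis conversion threshold.
        self.max_flat_entries = max_flat_entries
        self.flat = b""    # packed as [klen][key][vlen][value]..., no pointers
        self.table = None  # real hash table, used only after conversion
        self.count = 0

    def _scan(self):
        """Yield (start, end, key, value) for every packed entry, in order."""
        pos = 0
        while pos < len(self.flat):
            start = pos
            (klen,) = struct.unpack_from("<H", self.flat, pos)
            pos += 2
            key = self.flat[pos:pos + klen]
            pos += klen
            (vlen,) = struct.unpack_from("<H", self.flat, pos)
            pos += 2
            value = self.flat[pos:pos + vlen]
            pos += vlen
            yield start, pos, key, value

    def get(self, key):
        if self.table is not None:
            return self.table.get(key)      # average O(1) lookup
        for _, _, k, v in self._scan():     # O(n) linear scan, n is small
            if k == key:
                return v
        return None

    def set(self, key, value):
        if self.table is not None:
            self.table[key] = value
            return
        # Drop any existing entry for this key, then append the new one.
        for start, end, k, _ in self._scan():
            if k == key:
                self.flat = self.flat[:start] + self.flat[end:]
                self.count -= 1
                break
        self.flat += (struct.pack("<H", len(key)) + key
                      + struct.pack("<H", len(value)) + value)
        self.count += 1
        # Past the threshold: trade memory for speed and build a real table.
        if self.count > self.max_flat_entries:
            self.table = {k: v for _, _, k, v in self._scan()}
            self.flat = b""


def bucket_for(item_id, bucket_size=1000):
    """Hypothetical bucketing: map an ID to (hash name, field name)."""
    return "mediabucket:%d" % (item_id // bucket_size), str(item_id)


h = SmallHash(max_flat_entries=2)
h.set(b"a", b"1")
h.set(b"b", b"2")   # still the compact flat encoding
h.set(b"c", b"3")   # third entry exceeds the threshold and triggers conversion
```

Raising `max_flat_entries` in this sketch mirrors what the answer describes: more hashes stay in the compact form, and each lookup pays for a longer linear scan.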