Tags: redis, spring-data-redis, amazon-elasticache, lettuce

How can I stop redis memory usage increasing with connection count


We are running Redis via ElastiCache on AWS and are seeing memory usage spike when running a large number of Lambda functions that only perform reads. Here is some example output from redis-cli --stat:

------- data ------ --------------------- load -------------------- - child -
keys       mem      clients blocked requests            connections
1002       28.11M   15      0       2751795 (+11)       53877
1002       28.07M   15      0       2751797 (+2)        53877
1002       28.07M   15      0       2751799 (+2)        53877
1002       28.11M   15      0       2751803 (+4)        53877
1002       28.07M   15      0       2751806 (+3)        53877
1001       28.11M   15      0       2751808 (+2)        53877
1007       28.08M   15      0       2751837 (+29)       53877
1007       28.08M   15      0       2751839 (+2)        53877
1005       28.10M   16      0       2751841 (+2)        53878
1007       171.68M  94      0       2752012 (+171)      53957
1006       545.93M  316     0       2752683 (+671)      54179
1006       1.07G    483     0       2753508 (+825)      54346
1006       1.54G    677     0       2754251 (+743)      54540
1006       1.98G    882     0       2755024 (+773)      54745
1006       2.35G    1010    0       2755776 (+752)      54873
1005       2.78G    1014    0       2756548 (+772)      54877
1005       2.80G    1014    0       2756649 (+101)      54877
1004       2.79G    1014    0       2756652 (+3)        54877
1008       2.79G    1014    0       2756682 (+30)       54877
1007       2.79G    1014    0       2756685 (+3)        54877

As you can see, the number of keys stays pretty much constant, but as the number of clients increases the memory usage ramps up to 2.8GB. Is this memory pattern expected, and if so, is there a way to mitigate it other than increasing the amount of RAM available to the process?

The Lambda clients are written in Java using Lettuce 5.2.1.RELEASE and spring-data-redis 2.2.1.RELEASE.

Unless there is some additional Redis interaction happening inside spring-data-redis, the client code is basically as follows:

public <T> T get(final String label, final RedisTemplate<String, ?> redisTemplate) {
    // Look up a single field (label) in the hash stored under REDIS_KEY
    final BoundHashOperations<String, String, T> cache = redisTemplate.boundHashOps(REDIS_KEY);
    return cache.get(label);
}

There are no usages of RedisTemplate#keys in my codebase; the only interaction with Redis is via RedisTemplate#boundHashOps.
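For reference, the template wiring is roughly as follows. This is a minimal sketch assuming a standard LettuceConnectionFactory and a JSON hash-value serializer; the class name, endpoint and serializer choices are illustrative rather than copied from our codebase.

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisStandaloneConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.serializer.GenericJackson2JsonRedisSerializer;
import org.springframework.data.redis.serializer.StringRedisSerializer;

@Configuration
public class RedisConfig {

    @Bean
    public LettuceConnectionFactory redisConnectionFactory() {
        // Placeholder endpoint; each Lambda container opens its own connection here
        return new LettuceConnectionFactory(
                new RedisStandaloneConfiguration("my-cluster.example.cache.amazonaws.com", 6379));
    }

    @Bean
    public RedisTemplate<String, Object> redisTemplate(final LettuceConnectionFactory factory) {
        final RedisTemplate<String, Object> template = new RedisTemplate<>();
        template.setConnectionFactory(factory);
        template.setKeySerializer(new StringRedisSerializer());
        template.setHashKeySerializer(new StringRedisSerializer());
        // The hash values are large serialized JSON objects
        template.setHashValueSerializer(new GenericJackson2JsonRedisSerializer());
        return template;
    }
}

Since each Lambda container holds its own connection, the client count in the stats above tracks the number of concurrently running functions.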

Here is the output from redis-cli info memory before and after the spike:

Before

# Memory
used_memory:31558400
used_memory_human:30.10M
used_memory_rss:50384896
used_memory_rss_human:48.05M
used_memory_peak:6498905008
used_memory_peak_human:6.05G
used_memory_peak_perc:0.49%
used_memory_overhead:4593040
used_memory_startup:4203584
used_memory_dataset:26965360
used_memory_dataset_perc:98.58%
allocator_allocated:32930040
allocator_active:34332672
allocator_resident:50593792
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:5140907060
maxmemory_human:4.79G
maxmemory_policy:volatile-lru
allocator_frag_ratio:1.04
allocator_frag_bytes:1402632
allocator_rss_ratio:1.47
allocator_rss_bytes:16261120
rss_overhead_ratio:1.00
rss_overhead_bytes:-208896
mem_fragmentation_ratio:1.60
mem_fragmentation_bytes:18826560
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:269952
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0

After

# Memory
used_memory:4939687896
used_memory_human:4.60G
used_memory_rss:4754452480
used_memory_rss_human:4.43G
used_memory_peak:6498905008
used_memory_peak_human:6.05G
used_memory_peak_perc:76.01%
used_memory_overhead:4908463998
used_memory_startup:4203584
used_memory_dataset:31223898
used_memory_dataset_perc:0.63%
allocator_allocated:5017947040
allocator_active:5043314688
allocator_resident:5161398272
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:5140907060
maxmemory_human:4.79G
maxmemory_policy:volatile-lru
allocator_frag_ratio:1.01
allocator_frag_bytes:25367648
allocator_rss_ratio:1.02
allocator_rss_bytes:118083584
rss_overhead_ratio:0.92
rss_overhead_bytes:-406945792
mem_fragmentation_ratio:0.96
mem_fragmentation_bytes:-185235352
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:4904133550
mem_aof_buffer:0
mem_allocator:jemalloc-5.1.0
active_defrag_running:0
lazyfree_pending_objects:0

Solution

  • Having discussed this with AWS support, we learned that the cause of this memory spike is that each of the ~1000 Lambda clients fills up a client output buffer on the server with ~5MB of data, because the values we store in Redis are large serialized JSON objects.

    Their recommendations are to either:

    Add 2-3 replicas to the cluster and use the replica nodes for read requests; the reader endpoint can be used to load-balance those requests (see the configuration sketch after the parameter list below).

    Or control the client output buffers with the parameters below, noting that a client will be disconnected if it reaches the buffer limit; a sketch of applying these through a parameter group also follows the list.

    • client-output-buffer-limit-normal-hard-limit >> If a client's output buffer reaches the specified number of bytes, the client will be disconnected. The default is zero (no hard limit), so by default a client can use as much buffer memory as is available.
    • client-output-buffer-limit-normal-soft-limit >> If a client's output buffer reaches the specified number of bytes, the client will be disconnected, but only if this condition persists for client-output-buffer-limit-normal-soft-seconds. The default is zero (no soft limit).
    • client-output-buffer-limit-normal-soft-seconds >> The number of seconds a client's output buffer may stay above the soft limit before the client is disconnected. The default is zero.
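    For the first option, a minimal sketch of preferring replica reads with Lettuce and spring-data-redis might look like the following; the endpoints are placeholders and the static master/replica topology is an assumption rather than something taken from the question.

    import io.lettuce.core.ReadFrom;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.redis.connection.RedisStaticMasterReplicaConfiguration;
    import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
    import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;

    @Configuration
    public class ReplicaReadConfig {

        @Bean
        public LettuceConnectionFactory redisConnectionFactory() {
            // Prefer replicas for reads; fall back to the primary if no replica is available
            final LettuceClientConfiguration clientConfig = LettuceClientConfiguration.builder()
                    .readFrom(ReadFrom.REPLICA_PREFERRED)
                    .build();

            // Placeholder endpoints: the primary plus the replicas in the replication group
            final RedisStaticMasterReplicaConfiguration topology =
                    new RedisStaticMasterReplicaConfiguration("primary.example.cache.amazonaws.com", 6379);
            topology.addNode("replica-1.example.cache.amazonaws.com", 6379);
            topology.addNode("replica-2.example.cache.amazonaws.com", 6379);

            return new LettuceConnectionFactory(topology, clientConfig);
        }
    }

    With this in place the read traffic, and therefore the client output buffers, is spread across the replica nodes instead of all landing on the primary.

    For the second option, ElastiCache does not expose CONFIG SET, so these parameters are changed through the cluster's parameter group. A sketch using the AWS SDK for Java follows; the parameter group name and the limit values are purely illustrative.

    import com.amazonaws.services.elasticache.AmazonElastiCache;
    import com.amazonaws.services.elasticache.AmazonElastiCacheClientBuilder;
    import com.amazonaws.services.elasticache.model.ModifyCacheParameterGroupRequest;
    import com.amazonaws.services.elasticache.model.ParameterNameValue;

    public class OutputBufferLimits {

        public static void main(final String[] args) {
            final AmazonElastiCache elastiCache = AmazonElastiCacheClientBuilder.defaultClient();

            // "my-redis-params" and the byte/second values are illustrative only
            elastiCache.modifyCacheParameterGroup(new ModifyCacheParameterGroupRequest()
                    .withCacheParameterGroupName("my-redis-params")
                    .withParameterNameValues(
                            new ParameterNameValue()
                                    .withParameterName("client-output-buffer-limit-normal-hard-limit")
                                    .withParameterValue("268435456"),   // 256MB hard limit
                            new ParameterNameValue()
                                    .withParameterName("client-output-buffer-limit-normal-soft-limit")
                                    .withParameterValue("134217728"),   // 128MB soft limit
                            new ParameterNameValue()
                                    .withParameterName("client-output-buffer-limit-normal-soft-seconds")
                                    .withParameterValue("60")));        // sustained for 60 seconds
        }
    }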

    Given these constraints and our usage profile, we're actually going to switch to S3 for this use case.