python, amazon-web-services, redis, uwsgi, redis-py

AWS Redis + uWSGI behind NGINX - high load


I'm running a Python application (Flask + redis-py) with uWSGI + NGINX, using AWS ElastiCache (Redis 2.8.24).

While trying to improve my application's response time, I noticed that under high load (500 requests per second for 30 seconds, using loader.io) I'm losing requests. For this test I'm deliberately using a single server with no load balancer: one uWSGI instance with 4 processes. [stress test graph]

I've dug a little deeper and found that under this load, some requests to ElastiCache are slow. For example:

  • normal load: cache_set time 0.000654935836792
  • heavy load: cache_set time 0.0122258663177 (this doesn't happen for all requests; it just occurs randomly)

My AWS ElastiCache cluster consists of 2 nodes of type cache.m4.xlarge (default AWS configuration settings). See the clients connected over the last 3 hours: [AWS ElastiCache connected-clients graph]

I don't think this makes sense: 14 servers currently use this cluster (8 of them with high traffic of XX RPS), so I would expect to see a much higher client count.
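
One way to cross-check the graph above is to ask Redis itself how many clients it sees. This is just a diagnostic sketch of my own (not part of the original test), using the rdb clients defined further down; connected_clients is a standard field of the Redis INFO output:

# Diagnostic sketch (my own, not from the original test): compare the
# CloudWatch client count with what Redis itself reports.
info = rdb[0].info()
print 'connected clients: ' + str(info['connected_clients'])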

uWSGI config (Version 2.0.5.1)

processes = 4
enable-threads = true
threads = 20
vacuum = true
die-on-term = true
harakiri = 10
max-requests = 5000
thread-stacksize = 2048
thunder-lock = true
max-fd = 150000
# currently disabled for testing
#cheaper-algo = spare2
#cheaper = 2
#cheaper-initial = 2
#workers = 4
#cheaper-step = 1

NGINX is just a web proxy to uWSGI over a Unix socket.

This is how I open the connections to Redis:

rdb = [
    redis.StrictRedis(host='server-endpoint', port=6379, db=0),
    redis.StrictRedis(host='server-endpoint', port=6379, db=1)
]
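
A note on how these clients behave (my own addition, not from the original post): as far as I understand redis-py, each StrictRedis instance keeps its own connection pool per process, and connections are created lazily and reused, so the number of open connections reflects actual concurrency rather than servers × processes × databases. If you want the pool size and timeouts to be explicit, something like the sketch below works with redis-py 2.10; the max_connections and socket_timeout values are illustrative only:

# Sketch with explicit connection pools (values are illustrative, not from the question)
import redis

pool_db0 = redis.ConnectionPool(host='server-endpoint', port=6379, db=0,
                                max_connections=20, socket_timeout=1)
pool_db1 = redis.ConnectionPool(host='server-endpoint', port=6379, db=1,
                                max_connections=20, socket_timeout=1)

rdb = [
    redis.StrictRedis(connection_pool=pool_db0),
    redis.StrictRedis(connection_pool=pool_db1)
]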

For example, this is how I set a value:

def cache_set(key, subkey, val, db, cache_timeout=DEFAULT_TIMEOUT):
    t = time.time()
    merged_key = key + ':' + subkey
    res = rdb[db].set(merged_key, val, cache_timeout)
    print 'cache_set time ' + str(time.time() - t)
    return res

cache_set('prefix', 'key_name', 'my glorious value', 0, 20)

This is how I get a value:

def cache_get(key, subkey, db, _eval=False):
    t = time.time()
    merged_key = key + ':' + subkey
    val = rdb[db].get(merged_key)
    if _eval:
        if val:
            val = eval(val)
        else:  # None
            val = 0
    print 'cache_get time ' + str(time.time() - t)
    return val

cache_get('prefix', 'key_name', 0)

Versions:

  • uWSGI: 2.0.5.1
  • Flask: 0.11.1
  • redis-py: 2.10.5
  • Redis: 2.8.24

So to conclude:

  1. Why is the AWS client count low if 14 servers are connected, each with 4 processes, and each of them opens connections to 8 different databases within the Redis cluster?
  2. What causes the request response time to climb?
  3. I would appreciate any advice regarding ElastiCache and/or uWSGI performance under heavy load.

Solution

  • Short Answer

So, to sum it up: in my case the problem was not the ElastiCache requests but uWSGI memory usage.

    Long Answer

    I've installed uwsgitop with this setup:

    ### Stats
    ### ---
    ### disabled by default
    ### To see stats run: uwsgitop /tmp/uwsgi_stats.socket
    ### uwsgitop must be installed (pip install uwsgitop)
    stats = /tmp/uwsgi_stats.socket
    

    This will expose the uWSGI stats to uwsgitop.
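
    If you prefer to look at the raw numbers without uwsgitop, the stats socket simply dumps a JSON document every time you connect to it. Here is a minimal sketch of my own for reading it (the field names are the ones I've seen in the uWSGI stats JSON; adjust if your version differs):

    import json
    import socket

    def read_uwsgi_stats(path='/tmp/uwsgi_stats.socket'):
        # the stats server writes one JSON document and closes the connection
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(path)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
        sock.close()
        return json.loads(''.join(chunks))

    stats = read_uwsgi_stats()
    for worker in stats['workers']:
        print 'worker %s rss=%s avg_rt=%s' % (worker['id'], worker['rss'], worker['avg_rt'])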

    I then used loader.io to stress test the application with 350-500 requests/second.

    What I discovered with my previous configuration was that the uWSGI workers kept growing in memory until memory was exhausted and the CPU spiked. New workers that had to respawn also demanded CPU, which caused a kind of overload on the servers; this in turn caused NGINX to time out and close those connections.

    So I did some research and config modifications until I arrived at the setup below, which currently handles ~650 rps per instance with a ~13 ms response time, which is great for me.

    * My application used (and still uses some) pickled .dat files on disk, some of which were heavy to load; I've reduced the disk dependency to a minimum. *
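
    As an illustration of what reducing that disk dependency can look like (my own sketch, not the actual application code; the path and names are placeholders): load each heavy pickled file once per worker and keep it in memory, instead of re-reading it from disk inside request handlers.

    import pickle

    _HEAVY_DATA = None  # loaded at most once per uWSGI worker

    def get_heavy_data(path='data/lookup.dat'):
        # placeholder path; load the pickled file on first use and cache it in memory
        global _HEAVY_DATA
        if _HEAVY_DATA is None:
            with open(path, 'rb') as f:
                _HEAVY_DATA = pickle.load(f)
        return _HEAVY_DATA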

    For anyone who might see this in the future: if you need fast responses, make everything you can asynchronous. For example, use Celery + RabbitMQ for any database requests, if possible.
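
    A minimal sketch of that idea (my own illustration; the broker URL and the task body are placeholders, not the original code):

    from celery import Celery

    # RabbitMQ as the broker; the URL is a placeholder
    celery_app = Celery('tasks', broker='amqp://guest:guest@localhost:5672//')

    @celery_app.task
    def persist_event(payload):
        # do the slow database/disk work here, outside the request/response cycle
        pass

    # inside a Flask view: fire-and-forget instead of blocking the worker thread
    persist_event.delay({'user_id': 42, 'action': 'click'})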

    uWSGI Configuration:

    listen = 128
    processes = 8
    threads = 2
    max-requests = 10000
    reload-on-as = 4095
    reload-mercy = 5
    #reload-on-rss = 1024
    limit-as = 8192
    cpu-affinity = 3
    thread-stacksize = 1024
    max-fd = 250000
    buffer-size = 30000
    thunder-lock = true
    vacuum = true
    enable-threads = true
    no-orphans = true
    die-on-term = true
    

    NGINX relevant parts:

    user nginx;
    worker_processes 4;
    worker_rlimit_nofile 20000;
    thread_pool my_threads threads=16;
    pid /run/nginx.pid;
    
    events {
        accept_mutex off;
        # determines how many clients will be served per worker
        # max clients = worker_connections * worker_processes
        # max clients is also limited by the number of socket connections available on the system (~64k)
        worker_connections 19000;
    
        # optimized to serve many clients with each thread, essential for linux -- for testing environment
        use epoll;
    
        # accept as many connections as possible, may flood worker connections if set too low -- for testing environment
        multi_accept on;
    }
    
    http {
        ...
        aio                     threads;
        sendfile                on;
        sendfile_max_chunk      512k;
        tcp_nopush              on;
        tcp_nodelay             on;
        keepalive_timeout       5 5;
        keepalive_requests      0;
        types_hash_max_size     2048;
        send_timeout            15;
        ...
    }
    

    Hope it helps!