Tags: python, multithreading, flask, concurrent.futures

Will I get a performance boost combining concurrent.futures with Flask?


I'm wondering if it's OK to use concurrent.futures with Flask. Here's an example.

import requests
from flask import Flask
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=10)
app = Flask(__name__)

@app.route("/path/<xxx>")
def hello(xxx):
    f = executor.submit(task, xxx)
    return "OK"

def task(xxx):
    resp = requests.get("some_url")
    # save to mongodb

app.run()

The task is I/O-bound and the return value is not needed. Requests won't come frequently; I'd guess 10/s at most.

I tested it and it worked. What I want to know is whether I can get a performance boost using multithreading this way. Will Flask block the task in some way?
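For reference, here is a self-contained way to check that submitted tasks actually run, and to surface their exceptions, which `Future` objects otherwise store silently. The `task`/`log_error` names and the sleep stand-in are illustrative, not part of the app above:

```python
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=10)
results = []

def task(x):
    time.sleep(0.01)  # stand-in for the blocking HTTP call
    results.append(x)

def log_error(future):
    # a Future stores any exception instead of raising it;
    # a done-callback is one way to make failures visible
    if future.exception() is not None:
        print("task failed:", future.exception())

f = executor.submit(task, 42)
f.add_done_callback(log_error)
executor.shutdown(wait=True)
print(results)  # [42]
```

Without something like `log_error` (or calling `f.result()`), a task that raises will fail silently, so "it worked" is worth double-checking this way.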


Solution

  • This depends on more factors than Flask alone, such as what you are running in front of Flask (gunicorn, gevent, uwsgi, nginx, etc.). If you find that your request to "some_url" is indeed a bottleneck, pushing it to another thread might provide a boost, but again that depends on your individual circumstances; many elements in a web stack can make the process slow.

    Instead of multithreading inside the Flask process (which can quickly get complicated), pushing the blocking I/O to a helper process might be a better solution: send a Redis message from the Flask app to a separate process running an asyncio event loop, and let that process do the work. This scales well.

    app.py

    from flask import Flask
    import redis
    
    r = redis.StrictRedis(host='127.0.0.1', port=6379)
    app = Flask(__name__)
    
    @app.route("/")
    def hello():
        # send your message to the other process with redis
        r.publish('some-channel', 'some data')
        return "OK"
    
    if __name__ == '__main__':
        app.run(port=4000, debug=True)
    

    helper.py

    import asyncio
    import asyncio_redis
    import aiohttp
    
    async def get_page():
        # get some url; current aiohttp requires a ClientSession
        async with aiohttp.ClientSession() as session:
            async with session.get('http://example.com') as resp:
                data = await resp.read()

        # insert into mongo using Motor or some other async DBAPI
        #await insert_into_database(data)

    async def run():
        # Create connection
        connection = await asyncio_redis.Connection.create(host='127.0.0.1', port=6379)

        # Create subscriber.
        subscriber = await connection.start_subscribe()

        # Subscribe to channel.
        await subscriber.subscribe(['some-channel'])

        # Wait for incoming events, forever.
        while True:
            reply = await subscriber.next_published()
            print('Received:', repr(reply.value), 'on channel', reply.channel)
            await get_page()

    if __name__ == '__main__':
        loop = asyncio.get_event_loop()
        loop.run_until_complete(run())
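The essence of this design, a web handler that publishes a message and returns immediately while a consumer loop does the slow I/O, can be sketched without a running Redis server by standing in an `asyncio.Queue` for the pub/sub channel. All names here (`worker`, `main`, the sentinel) are illustrative:

```python
import asyncio

async def worker(queue):
    # stand-in for helper.py: consume messages and do the slow work
    handled = []
    while True:
        msg = await queue.get()
        if msg is None:  # sentinel so the sketch can terminate
            return handled
        # here you would fetch the page and write to MongoDB
        handled.append(msg)

async def main():
    queue = asyncio.Queue()
    consumer = asyncio.create_task(worker(queue))
    # stand-in for app.py: "publish" and return immediately
    await queue.put('some data')
    await queue.put(None)
    return await consumer

print(asyncio.run(main()))  # ['some data']
```

The real version swaps the queue for a Redis channel so the producer and consumer can live in separate processes (or machines), but the decoupling is the same: the handler never waits on the slow I/O.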