python multithreading tornado event-driven concurrent.futures

Can ThreadPoolExecutor help single-threaded application efficiency?


We want to build an e-commerce application. The team are Python developers, but we are not using the usual Python web frameworks (Django, Flask, ...). We found Tornado excellent for its simplicity, so we are strongly leaning toward it.

The problem is that Tornado is single-threaded, and the application will do CPU-heavy work: password hashing (login) and image processing (thumbnail generation). Can ThreadPoolExecutor fill the role of a multithreaded server like Apache, as in this example?

from concurrent.futures import ThreadPoolExecutor
from tornado import gen
from tornado.process import cpu_count
import bcrypt


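# One worker thread per CPU core for the blocking bcrypt calls.
# save_user, load_user and IncorrectPasswordError are assumed to be defined elsewhere.
pool = ThreadPoolExecutor(cpu_count())

@gen.coroutine
def create_user(name, password):
    # Hash in a worker thread; the coroutine resumes when the future resolves.
    hashed_pw = yield pool.submit(bcrypt.hashpw, password, bcrypt.gensalt())
    yield save_user(name, hashed_pw)

@gen.coroutine
def login(name, password):
    user = yield load_user(name)
    # Verify in a worker thread so the IOLoop stays free for other requests.
    match = yield pool.submit(bcrypt.checkpw, password, user.hashed_pw)
    if not match:
        raise IncorrectPasswordError()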

So Tornado hands the hashing work to another thread, which frees it to accept other requests. Will this approach work?

NB: There is also a solution involving a load balancer, but the team doesn't want to pursue this solution right now.


Solution

  • Yes, ThreadPoolExecutor will work well here. It appears both hashpw and checkpw release the GIL during the CPU-heavy parts of their operation:

    bcrypt_hashpw(PyObject *self, PyObject *args, PyObject *kw_args)
    {
        ...
        Py_BEGIN_ALLOW_THREADS;
        ret = pybc_bcrypt(password_copy, salt_copy, hashed, sizeof(hashed));
        Py_END_ALLOW_THREADS;
        ...
    

    That means the hashing can run on one CPU core while the Tornado thread keeps handling incoming requests on another.

    Just keep in mind that if you need to do other CPU-bound operations that run as pure Python code (meaning the GIL doesn't get released), you'll need to use a ProcessPoolExecutor to avoid taking a performance hit.
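The behaviour described above can be checked with a small sketch. This is not the asker's code: it uses only the standard library, with `hashlib.pbkdf2_hmac` as a stand-in for `bcrypt.hashpw` and plain asyncio in place of Tornado (Tornado 5.0 and later run on the asyncio event loop and offer `IOLoop.run_in_executor` for the same pattern). The names `slow_hash`, `hash_off_loop`, `count_ticks` and `demo` are hypothetical, as are the iteration count and the fixed salt.

```python
import asyncio
import hashlib
from concurrent.futures import Executor, ThreadPoolExecutor


def slow_hash(password: bytes, salt: bytes) -> bytes:
    """Stand-in for bcrypt.hashpw: a deliberately expensive key derivation."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, 200_000)


async def hash_off_loop(executor: Executor, password: bytes, salt: bytes) -> bytes:
    """Run slow_hash in the given executor instead of on the event loop thread."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, slow_hash, password, salt)


async def count_ticks(stop: asyncio.Event) -> int:
    """Simulate other requests: count how often the loop gets a turn."""
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(0.001)
    return ticks


async def demo(executor: Executor):
    """Hash one password while the ticker keeps running on the loop."""
    stop = asyncio.Event()
    ticker = asyncio.ensure_future(count_ticks(stop))
    digest = await hash_off_loop(executor, b"s3cret", b"fixed-salt")
    stop.set()
    return digest, await ticker


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        digest, ticks = asyncio.run(demo(pool))
    print(f"hash finished; event loop ran {ticks} ticks in the meantime")
```

Because `hash_off_loop` accepts any `concurrent.futures.Executor`, passing a `ProcessPoolExecutor` instead is the variation to try for pure-Python CPU-bound functions. That requires the function and its arguments to be picklable, the function to live at module level, and, on platforms that spawn worker processes, the pool to be created under an `if __name__ == "__main__":` guard.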