We want to build an e-commerce application. The team are Python developers, but we are not using the usual Python web frameworks (Django, Flask, ...). We found Tornado excellent for its simplicity, so we rated it very highly as a candidate.
The problem is that Tornado is single-threaded, and the application will do password hashing (login) and image processing (thumbnail generation). Can ThreadPoolExecutor play the role of a multithreaded server like Apache, as in this example?

from concurrent.futures import ThreadPoolExecutor
from tornado import gen
from tornado.process import cpu_count
import bcrypt

# One worker thread per CPU core.
pool = ThreadPoolExecutor(cpu_count())

# save_user, load_user and IncorrectPasswordError are application code
# defined elsewhere.

@gen.coroutine
def create_user(name, password):
    # Hash in a worker thread so the IOLoop thread is not blocked.
    hashed_pw = yield pool.submit(bcrypt.hashpw, password, bcrypt.gensalt())
    yield save_user(name, hashed_pw)

@gen.coroutine
def login(name, password):
    user = yield load_user(name)
    # Verify in a worker thread as well.
    match = yield pool.submit(bcrypt.checkpw, password, user.hashed_pw)
    if not match:
        raise IncorrectPasswordError()

So Tornado sends the hashing work to another thread, which frees it to receive other requests. Will this approach work?
NB: There is also a solution involving a load balancer, but the team doesn't want to pursue this solution right now.
Yes, ThreadPoolExecutor will work well here. It appears both hashpw and checkpw release the GIL during the CPU-heavy parts of their operation:

bcrypt_hashpw(PyObject *self, PyObject *args, PyObject *kw_args)
{
    ...
    Py_BEGIN_ALLOW_THREADS;
    ret = pybc_bcrypt(password_copy, salt_copy, hashed, sizeof(hashed));
    Py_END_ALLOW_THREADS;
    ...

That means you'll be able to farm the hashing off to a worker thread on one CPU, while Tornado keeps handling incoming requests on another CPU.
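
To see the thread-pool pattern in isolation, here is a minimal sketch that needs nothing outside the standard library. It assumes hashlib.pbkdf2_hmac as a stand-in for bcrypt.hashpw (CPython's OpenSSL-backed implementation also releases the GIL while it runs); slow_hash and hash_many are hypothetical names, and the iteration count is arbitrary.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def slow_hash(password: bytes, salt: bytes) -> bytes:
    # Stand-in for bcrypt.hashpw (an assumption, not the question's code):
    # a deliberately slow key-derivation function from the standard library.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 50_000)


# One worker thread per CPU core, as in the question.
pool = ThreadPoolExecutor(max_workers=os.cpu_count() or 1)


def hash_many(passwords, salt):
    # Submit every job first so the worker threads can run side by side,
    # then collect the results in submission order.
    futures = [pool.submit(slow_hash, pw, salt) for pw in passwords]
    return [f.result() for f in futures]
```

In a Tornado coroutine you would yield each future instead of calling result(), exactly as the question's code does, so the IOLoop thread never blocks on the hash.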
Just keep in mind that if you need to do some other CPU-bound operations that run in pure Python (meaning the GIL doesn't get released), you'll need to use a ProcessPoolExecutor to avoid taking a performance hit.
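
As a rough sketch of that second case, the following uses a deliberately naive pure-Python prime counter as the CPU-bound work. The names count_primes and count_primes_parallel are hypothetical, chosen only for illustration. The worker function lives at module level because a process pool has to pickle it by name to send it to the worker processes.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit: int) -> int:
    # Pure-Python, CPU-bound work: the GIL is held for the whole loop,
    # so running this in threads would not use more than one core.
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, max_workers=None):
    # Each limit is handled in a separate worker process, so the jobs
    # can run on different cores at the same time.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(count_primes, limits))
```

Call count_primes_parallel from under an `if __name__ == "__main__":` guard (or from code that is only reached after startup), since some platforms start worker processes by re-importing the main module. ProcessPoolExecutor.submit returns the same kind of future as ThreadPoolExecutor.submit, so swapping the pool type should not require changing how the result is awaited.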