I am building a test server that loads a huge pickle file (it takes about 30s) when an endpoint is hit. My goal is to change it so that the pickle is loaded into memory as a Python object in the background, on a separate thread, when the Tornado web server boots up. Then, when the endpoint is hit, it either finds the object already in memory or waits until the thread has finished loading. That would make boot-up much faster.
I am looking for recommendations on the best way to add async to make this work.
my_server.py
import tornado.ioloop
import tornado.web

from my_class import MyClass


class MainHandler(tornado.web.RequestHandler):
    def get(self):
        m = MyClass.get_foobar_object_by_name('foobar')
        self.write("Hello, world")


def make_app():
    return tornado.web.Application([
        (r"/", MainHandler),
    ])


if __name__ == "__main__":
    app = make_app()
    app.listen(8888)
    MyClass.load()  # takes 30s to load
    tornado.ioloop.IOLoop.current().start()
my_class.py
import pickle


class MyClass(object):
    pickle_path = '/opt/some/path/big_file.pickle'
    foobar_map = None

    @staticmethod
    def load():
        # this step takes about 30s to load
        with open(MyClass.pickle_path, 'rb') as f:
            MyClass.foobar_map = pickle.load(f)

    @staticmethod
    def get_foobar_object_by_name(foobar_name):
        if MyClass.foobar_map is None:
            MyClass.load()
        return MyClass.foobar_map.get(foobar_name)
The pickle module has a synchronous interface, so the only way to run it asynchronously is to run it on another thread. Using the new IOLoop.run_in_executor interface in Tornado 5.0:
import pickle

from tornado.ioloop import IOLoop
from tornado.locks import Lock
from tornado.web import RequestHandler


def _read_pickle(path):
    # Runs on an executor thread, so the blocking read does not stall the IOLoop.
    with open(path, 'rb') as f:
        return pickle.load(f)


class MyClass:
    pickle_path = '/opt/some/path/big_file.pickle'
    foobar_map = None
    lock = Lock()

    @staticmethod
    async def load():
        # Lock is used directly as an async context manager (no call parentheses).
        async with MyClass.lock:
            # Check again inside the lock to make sure we only do this once.
            if MyClass.foobar_map is None:
                MyClass.foobar_map = await IOLoop.current().run_in_executor(
                    None, _read_pickle, MyClass.pickle_path)

    @staticmethod
    async def get_foobar_object_by_name(foobar_name):
        if MyClass.foobar_map is None:
            await MyClass.load()
        return MyClass.foobar_map.get(foobar_name)


class MainHandler(RequestHandler):
    async def get(self):
        m = await MyClass.get_foobar_object_by_name('foobar')
        self.write("Hello, world")
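The lock-plus-executor pattern can be tried out without a real pickle file. The following is a minimal sketch using only the standard library asyncio module (Tornado 5.0 and later run on the asyncio event loop on Python 3). SlowStore, _blocking_load and the 0.2 s sleep are hypothetical stand-ins for MyClass and the 30 s pickle.load, not part of the original code. It starts the load in the background at "boot" and shows that requests arriving in the meantime wait for that single load instead of starting their own.

```python
import asyncio
import time


class SlowStore:
    """Hypothetical stand-in for MyClass; time.sleep simulates the slow pickle.load."""

    data = None
    load_count = 0  # how many times the blocking load actually ran
    _lock = None    # created lazily, inside the running event loop

    @staticmethod
    def _blocking_load():
        time.sleep(0.2)  # placeholder for the 30 s pickle.load
        SlowStore.load_count += 1
        return {"foobar": 42}

    @classmethod
    async def load(cls):
        if cls._lock is None:
            cls._lock = asyncio.Lock()
        async with cls._lock:
            # Check again inside the lock so the load runs only once.
            if cls.data is None:
                loop = asyncio.get_running_loop()
                cls.data = await loop.run_in_executor(None, cls._blocking_load)

    @classmethod
    async def get(cls, name):
        if cls.data is None:
            await cls.load()
        return cls.data.get(name)


async def main():
    # Kick off loading at "boot" without waiting for it to finish.
    background = asyncio.ensure_future(SlowStore.load())
    # Two "requests" arrive while the load is still in progress.
    results = await asyncio.gather(SlowStore.get("foobar"), SlowStore.get("foobar"))
    await background
    return results


results = asyncio.run(main())
```

In the Tornado server itself, the analogous kickoff would presumably be a call such as IOLoop.current().spawn_callback(MyClass.load) placed before IOLoop.current().start(), replacing the blocking MyClass.load() call from the question.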
Note that async is contagious: anything that calls an async function also needs to be async and use await.
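To illustrate why this matters, here is a small standard-library sketch; fetch_value, sync_caller and async_caller are made-up names for this example only. Calling an async function from ordinary code just creates a coroutine object and runs nothing, while awaiting it from another async function produces the actual result.

```python
import asyncio
import inspect


async def fetch_value():
    return 42


def sync_caller():
    # Without await, calling an async function only creates a coroutine object.
    coro = fetch_value()
    is_coroutine = inspect.iscoroutine(coro)
    coro.close()  # avoid the "coroutine was never awaited" warning
    return is_coroutine


async def async_caller():
    # With await, the coroutine actually runs and returns its result.
    return await fetch_value()


got_coroutine = sync_caller()
value = asyncio.run(async_caller())
```

This is why the handler's get method above has to become async def once get_foobar_object_by_name is a coroutine.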