Running a Tornado Server within a Jupyter Notebook


Taking the standard Tornado demonstration and pushing the IOLoop into a background thread allows querying of the server within a single script. This is useful when the Tornado server is an interactive object (see Dask or similar).

import asyncio
import requests
import tornado.ioloop
import tornado.web

from concurrent.futures import ThreadPoolExecutor

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")

def make_app():
    return tornado.web.Application([
        (r"/", MainHandler),
    ])

pool = ThreadPoolExecutor(max_workers=2)
loop = tornado.ioloop.IOLoop()

app = make_app()
app.listen(8888)
fut = pool.submit(loop.start)

print(requests.get("http://localhost:8888"))

The above works just fine in a standard Python script (though it is missing a safe shutdown). Jupyter notebooks are an ideal environment for these interactive Tornado servers. However, in Jupyter this idea breaks down because there is already an active running loop:

>>> import asyncio
>>> asyncio.get_event_loop()
<_UnixSelectorEventLoop running=True closed=False debug=False>

This shows up when running the above script in a Jupyter notebook: both the server and the request client are trying to open a connection in the same thread, and the code hangs. Building a new asyncio loop and/or Tornado IOLoop does not seem to help, and I suspect I am missing something in Jupyter itself.

The question: Is it possible to have a live Tornado server running in the background within a Jupyter notebook, so that standard Python requests or similar can connect to it from the primary thread? I would prefer to avoid asyncio in the code presented to users if possible, due to its relative complexity for novice users.


Solution

  • Part 1: Let's get nested tornado(s)

    To find this information you would have had to follow a trail of breadcrumbs. Start with what is described in the release notes of IPython 7. In particular, they point you to more information in the async and await sections of the documentation, and to this discussion, which suggests using nest_asyncio.

    The crux is the following:

    • A) Either you trick Python into running two nested event loops (this is what nest_asyncio does).
    • B) Or you schedule coroutines on the already existing event loop (I'm not sure how to do that with Tornado).

    I'm pretty sure you know all that, but I'm sure other readers will appreciate it.

    Unfortunately there is no way to make this totally transparent to users, unless you control the deployment (as on a JupyterHub) and can add these lines to the IPython startup scripts that are loaded automatically. But I think the following is simple enough.

    import nest_asyncio
    nest_asyncio.apply()
    
    
    # rest of your tornado setup and start code.
    

    Part 2: Gotcha: synchronous code blocks the event loop.

    The previous section only takes care of being able to run the Tornado app. But note that any synchronous code will block the event loop. So when you run print(requests.get("http://localhost:8888")), the server will appear not to work: you are blocking the event loop, which will resume only when the code finishes executing, and that code is itself waiting for the event loop to resume... (understanding this is an exercise left to the reader). You need to either issue print(requests.get("http://localhost:8888")) from another kernel, or use aiohttp.
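
    The blocking effect can be reproduced without Tornado at all. The following is a minimal sketch using only the standard library; the helper names (heartbeat, count_ticks_during, run_demo) are made up for illustration. A heartbeat task stands in for the server: it is starved during a time.sleep but keeps ticking during an await asyncio.sleep.

```python
import asyncio
import time


async def heartbeat(ticks, interval=0.01):
    """Record a timestamp each time the event loop gives this task a turn."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def count_ticks_during(pause):
    """Run `pause` while the heartbeat is scheduled; return how often it ticked."""
    ticks = []
    task = asyncio.ensure_future(heartbeat(ticks))
    await asyncio.sleep(0)  # yield once so the heartbeat records its first tick
    before = len(ticks)
    await pause()
    after = len(ticks)
    task.cancel()
    return after - before


async def blocking_pause():
    # Synchronous sleep: control never returns to the loop, so nothing else runs.
    time.sleep(0.2)


async def cooperative_pause():
    # Awaitable sleep: the loop is free to run the heartbeat while we wait.
    await asyncio.sleep(0.2)


def run_demo():
    """Hypothetical driver for a plain script (not for use inside a notebook)."""
    blocked = asyncio.run(count_ticks_during(blocking_pause))
    cooperative = asyncio.run(count_ticks_during(cooperative_pause))
    return blocked, cooperative
```

    In a plain script, run_demo() should report no heartbeat ticks for the blocking pause and several for the cooperative one. Inside a notebook, asyncio.run refuses to start because a loop is already running, so there you would use await count_ticks_during(blocking_pause) in a cell instead.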

    Here is how to use aiohttp in a similar way to requests.

    import aiohttp

    # IPython 7+ accepts top-level await in a cell.
    session = aiohttp.ClientSession()
    await session.get('http://localhost:8888')
    

    In this case, because aiohttp is non-blocking, things will appear to work properly. Here you can also see some extra IPython magic: async code is auto-detected and run on the current event loop.
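
    If a synchronous client has to stay, one option that may work is to push the blocking call onto a worker thread with run_in_executor, so the kernel's loop stays free to serve requests. This is a minimal sketch assuming Python 3.7+; blocking_fetch and fetch_without_blocking are hypothetical names, and urllib stands in for requests only to keep the example dependency-free.

```python
import asyncio
import urllib.request


def blocking_fetch(url):
    """Plain synchronous HTTP GET; a stand-in for requests.get(url)."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.status


async def fetch_without_blocking(url, fetch=blocking_fetch):
    """Run a blocking fetch in the default thread pool and await its result."""
    loop = asyncio.get_running_loop()
    # The event loop keeps running other callbacks while the thread waits on I/O.
    return await loop.run_in_executor(None, fetch, url)
```

    In a notebook cell this would be used as status = await fetch_without_blocking("http://localhost:8888"). It still exposes an await to the user, so it does not remove asyncio from the picture entirely.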

    A cool exercise would be to run requests.get in a loop in another kernel, run time.sleep(5) in the kernel where Tornado is running, and watch request processing stop...

    Part 3: Disclaimer and other routes:

    This is quite tricky. I would advise against using it in production, and would warn your users that this is not the recommended way of doing things.

    That does not completely solve your case: you would need to run things outside the main thread, which I'm not sure is possible.
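
    One route that may work for the off-main-thread part is to give the server its own event loop in a worker thread, so the kernel's loop is never involved. The following is a sketch under assumptions, not a tested recipe for every setup: it assumes Tornado 5 or later (where the IOLoop is backed by asyncio) and Python 3.7+, and BackgroundServer is a hypothetical helper, not part of Tornado.

```python
import asyncio
import threading

import tornado.httpserver
import tornado.netutil
import tornado.web


class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")


class BackgroundServer:
    """Hypothetical helper (not a Tornado API): serve an app from a worker thread."""

    def __init__(self, port=0):
        # Bind in the calling thread so bind errors surface immediately;
        # port=0 asks the OS for any free port.
        self._sockets = tornado.netutil.bind_sockets(port, address="127.0.0.1")
        self.port = self._sockets[0].getsockname()[1]
        self._ready = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # asyncio.run creates a new event loop owned by this thread,
        # independent of the loop the Jupyter kernel is already running.
        asyncio.run(self._serve())

    async def _serve(self):
        self._loop = asyncio.get_running_loop()
        self._stop = asyncio.Event()
        app = tornado.web.Application([(r"/", MainHandler)])
        server = tornado.httpserver.HTTPServer(app)
        server.add_sockets(self._sockets)
        self._ready.set()
        await self._stop.wait()
        server.stop()  # stop accepting new connections
        await server.close_all_connections()

    def start(self):
        self._thread.start()
        self._ready.wait()  # block until the server is accepting connections

    def stop(self):
        # asyncio.Event is not thread-safe, so set it from the server's own loop.
        self._loop.call_soon_threadsafe(self._stop.set)
        self._thread.join()
```

    With this, server = BackgroundServer(8888) followed by server.start() in one cell should let a later cell call requests.get("http://127.0.0.1:8888") from the main thread, and server.stop() provides the safe shutdown. The blocking client is harmless here because it blocks only the kernel's thread, not the thread the server runs in.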

    You can also try playing with other loop runners such as trio and curio; they might allow you to do things you can't do with asyncio by default, such as nesting, but here be dragons. I highly recommend trio and the multiple blog posts around its creation, especially if you are teaching async.

    Enjoy, hope that helped, and please report bugs, as well as things that did work.