Why does Tornado's WSGI support block for more than one request?

Imagine the following Tornado app:

import logging
import time

from tornado import gen, httpserver, ioloop, web, wsgi


def simple_app(environ, start_response):
    time.sleep(1)
    status = "200 OK"
    response_headers = [("Content-type", "text/plain")]
    start_response(status, response_headers)
    return [b"Hello, WSGI world!\n"]

class HelloHandler(web.RequestHandler):
    @gen.coroutine
    def get(self):
        yield gen.moment
        self.write('Hello from tornado\n')
        self.finish()

def main():
    wsgi_app = wsgi.WSGIContainer(simple_app)
    tornado_app = web.Application(
        [
            ('/tornado', HelloHandler),
            ('.*', web.FallbackHandler, dict(fallback=wsgi_app)),
        ],
        debug=True,
    )
    http_server = httpserver.HTTPServer(tornado_app)
    http_server.listen(8888)
    current_loop = ioloop.IOLoop.current()
    current_loop.start()


if __name__ == '__main__':
    main()

Now if you run this and request http://localhost:8888/, Tornado blocks until the WSGI request has finished (here, after the one-second sleep). That much I expected. But if you send many requests one after another, the IOLoop seems to block more or less forever.

I tried a benchmark like this:

$ ab -n 20 -c 2 http://localhost:8888/

In a second terminal I requested the other URL:

$ curl http://localhost:8888/tornado

The non-WSGI request did not get a response until all the other concurrent WSGI requests had finished. It is only answered promptly if I remove yield gen.moment.

Can anybody explain what is going on here, and how I can prevent Tornado from blocking all my requests rather than just one of them?

Solution

Tornado's WSGIContainer is not designed for high-traffic use. See the warning in its docs:

WSGI is a synchronous interface, while Tornado’s concurrency model is based on single-threaded asynchronous execution. This means that running a WSGI app with Tornado’s WSGIContainer is less scalable than running the same app in a multi-threaded WSGI server like gunicorn or uwsgi. Use WSGIContainer only when there are benefits to combining Tornado and WSGI in the same process that outweigh the reduced scalability.
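
To make the warning concrete, here is a minimal sketch of the effect using only the standard library's asyncio (which current Tornado versions run on top of). It is not Tornado's actual code path: blocking_handler and fast_handler are hypothetical stand-ins for the WSGI app and the coroutine handler, and the sleep is shortened to 0.2 seconds. The point is only that a synchronous call on the loop thread delays every coroutine that has yielded control.

```python
import asyncio
import time


def blocking_handler():
    """Hypothetical stand-in for a synchronous WSGI app call."""
    time.sleep(0.2)  # runs on the event loop thread, so the loop is stuck
    return "wsgi done"


async def fast_handler(started):
    """Hypothetical stand-in for a coroutine-based handler."""
    # Give up control for one loop iteration (similar in spirit to
    # `yield gen.moment` in the question).
    await asyncio.sleep(0)
    return time.monotonic() - started


async def demo():
    started = time.monotonic()
    fast = asyncio.ensure_future(fast_handler(started))
    await asyncio.sleep(0)  # let fast_handler start and yield
    blocking_handler()      # nothing else can run during this call
    return await fast


fast_latency = asyncio.run(demo())
print("fast handler finished after %.3f s" % fast_latency)
```

Although the fast handler does no real work, it cannot resume until the blocking call has returned, so its latency is at least the length of the sleep.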

In general it's best to run WSGI and Tornado applications in separate processes so the WSGI parts can have a server that is designed for WSGI. WSGIContainer should only be used when there are specific reasons to combine them in the same process, and then it should be used with care to avoid blocking the event loop for too long. It's best to do as much as possible in Tornado-native RequestHandlers so coroutines can be used instead of blocking.
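
If the blocking code really has to live in the same process, one common way to keep the event loop responsive is to push the blocking calls onto a thread pool. The sketch below shows the idea with plain asyncio and concurrent.futures; blocking_wsgi_call and fast_handler are hypothetical names, and the pool size of 4 is an arbitrary assumption. Newer Tornado releases document an executor argument on WSGIContainer for this purpose, so check the docs for your version before relying on it.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_wsgi_call(n):
    """Hypothetical stand-in for one synchronous WSGI request."""
    time.sleep(0.2)
    return "wsgi response %d" % n


async def fast_handler():
    """Hypothetical stand-in for a coroutine-based handler."""
    started = time.monotonic()
    await asyncio.sleep(0)  # yield to the loop once
    return time.monotonic() - started


async def demo():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # The blocking calls run in worker threads, not on the loop thread.
        slow = [
            loop.run_in_executor(pool, blocking_wsgi_call, n)
            for n in range(4)
        ]
        # The loop stays free, so the fast handler resumes right away.
        fast_latency = await fast_handler()
        responses = await asyncio.gather(*slow)
    return fast_latency, responses


fast_latency, responses = asyncio.run(demo())
print("fast handler finished after %.3f s" % fast_latency)
print(responses)
```

The trade-off is that the WSGI app must be thread-safe, and throughput is capped by the pool size, which is essentially what a dedicated multi-threaded WSGI server already gives you.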