I am running the Tornado web server in conjunction with MongoDB (using the pymongo driver). I am trying to make architectural decisions to maximize performance.
I have several subquestions regarding the blocking/non-blocking and asynchronous aspects of the resulting application when using Tornado and pymongo together:
It appears that the pymongo.mongo_client.MongoClient object automatically implements a pool of connections. Is the intended purpose of a "connection pool" to let me access MongoDB simultaneously from different threads? Is it true that if I use a single MongoClient instance from a single thread, there is really no "pool", since only one connection would be open at any time?
The following FAQ states:
Currently there is no great way to use PyMongo in conjunction with Tornado or Twisted. PyMongo provides built-in connection pooling, so some of the benefits of those frameworks can be achieved just by writing multi-threaded code that shares a MongoClient.
So I assume that I just pass a single MongoClient reference to each thread? Or is there more to it than that? What is the best way to trigger a callback when each thread produces a result? Should I have one thread running whose job is to watch a queue (Python's Queue.Queue) and handle each result by calling finish() on the RequestHandler object that was left open in Tornado? (Of course, the tornado.web.asynchronous decorator would be needed.)
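For illustration, here is a minimal sketch of the "share one client across worker threads and hand each result back to the event loop" idea. It is written against the standard library's asyncio and concurrent.futures so that it runs without Tornado or a database installed. FakeBlockingClient, handle_request, and the pool size of 4 are hypothetical names and values chosen for this example; the class only stands in for a shared MongoClient and is not part of pymongo.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class FakeBlockingClient:
    """Hypothetical stand-in for a single shared, thread-safe MongoClient."""

    def find_one(self, key):
        time.sleep(0.05)  # simulate a blocking network round trip
        return {"_id": key, "thread": threading.current_thread().name}


client = FakeBlockingClient()                 # one instance for all threads
executor = ThreadPoolExecutor(max_workers=4)  # assumed pool size


async def handle_request(key):
    loop = asyncio.get_running_loop()
    # The blocking call runs on a worker thread, so the event loop stays
    # free; the await resumes here once that thread has produced a result.
    doc = await loop.run_in_executor(executor, client.find_one, key)
    return doc  # a real handler would write the response at this point


async def main():
    # Four simulated requests whose blocking calls run on separate threads.
    return await asyncio.gather(*(handle_request(k) for k in range(4)))


if __name__ == "__main__":
    for doc in asyncio.run(main()):
        print(doc)
```

With this shape, no separate queue-watching thread is needed: the future returned by run_in_executor plays the role of the callback, and the result is delivered on the event loop's own thread.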
Finally, is it possible that I am just creating work? Should I just shortcut things by running a single threaded instance of Tornado and then start 3-4 instances per core? (The above FAQ reference seems to suggest this)
After all, doesn't the GIL in Python result in effectively different processes anyway? Or are there additional performance considerations (plus or minus) from the "non-blocking" aspects of Tornado? (I know that this is non-blocking in terms of I/O as pointed out here: Is Tornado really non-blocking?)
(Additional Note: I am aware of asyncmongo at: https://github.com/bitly/asyncmongo but want to use pymongo directly and not introduce this additional dependency.)
As I understand it, there are two models of web server: thread-based and event-driven.
Python has the GIL, which does not work well with threads, while the event-driven model uses only one thread, so go with event-driven.
PyMongo will block Tornado's event loop, so here are some suggestions:
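To see why a synchronous driver call is a problem in a single-threaded event loop, the following sketch measures how long the loop goes without servicing another task. It uses only the standard library's asyncio; time.sleep stands in for a blocking database call and asyncio.sleep for a non-blocking one. All function names and durations here are assumptions made for the example.

```python
import asyncio
import time


async def heartbeat(gaps, beats=5, interval=0.02):
    """Record the delay between consecutive wake-ups of this task."""
    last = time.monotonic()
    for _ in range(beats):
        await asyncio.sleep(interval)
        now = time.monotonic()
        gaps.append(now - last)
        last = now


async def blocking_query():
    time.sleep(0.2)  # stand-in for a synchronous driver call; freezes the loop


async def cooperative_query():
    await asyncio.sleep(0.2)  # stand-in for a non-blocking driver call


async def worst_gap(query):
    """Run the heartbeat alongside a query and return its longest stall."""
    gaps = []
    await asyncio.gather(heartbeat(gaps), query())
    return max(gaps)


if __name__ == "__main__":
    print("blocking:    %.2fs" % asyncio.run(worst_gap(blocking_query)))
    print("cooperative: %.2fs" % asyncio.run(worst_gap(cooperative_query)))
```

In the blocking case the heartbeat stalls for at least the full duration of the simulated query, which is what every other open request would experience; in the cooperative case it keeps ticking at roughly its normal interval.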
If you decide on a solution other than Tornado, such as Gevent, then you can use PyMongo, because it is stated that:
The only async framework that PyMongo fully supports is Gevent.
NB: Sorry if this is off topic, but the sentence:
Currently there is no great way to use PyMongo in conjunction with Tornado
should be dropped from the documentation; Mongotor and Motor work very well (Motor in particular).