Search code examples
pythonlinuxmultithreadingasynchronoustornado

Logging in an asynchronous Tornado (python) server


I am working on an application in which I may potentially need to log the entire traffic reaching the server. This feature may be turned on or off, or may be used when exceptions are caught.

In any case, I am concerned about the blocking nature of disk I/O operations and their impact on the performance of the server. The business logic that is applied when a request is handled (mostly POST http requests), is asynchronous in such that every network or db calls are asynchronously executed.

On the other hand, I am concerned about the delay to the thread while it is waiting for the disk IO operation to complete. The logged messages can be a few bytes to a few KBs but in some cases a few MBs. There is no real need for the thread to pause while data is written to disk, the http request can definitely complete at that point and there is no reason that the ioloop thread not to work on another task while data is written to disk.

So my questions are:

  1. am I over-worried about this issue? is logging to standard output and later redirecting it to a file "good enough"?
  2. what is the common approach, or the one you found most practical for logging in tornado-based applications? even for simple logging and not the (extreme) case I outlined above?
  3. is this basically an ideal case for queuing the logging messages and consume them from a dedicated thread?
  4. Say I do offload the logging to a different thread (like Homer Simpson's "Can't Someone Else Do It?"), if the thread that performs the disk logging is waiting for the disk io operation to complete, does the linux kernel takes that point as an opportunity a context switch?

Any comments or suggestion are much appreciated,

Erez


Solution

  • For "normal" logging (a few lines per request), I've always found logging directly to a file to be good enough. That may not be true if you're logging all the traffic to the server. The one time I've needed to do something like that I just captured the traffic externally with tcpdump instead of modifying my server.

    If you want to capture it in the process, start by just writing to a file from the main thread. As always, measure things in your own environment before taking drastic action (IOLoop.set_blocking_log_threshold is useful for determining if your logging is a problem).

    If writing from the main thread blocks for too long, you can either write to a queue that is processed by another thread, or write asynchronously to a pipe or socket to another process (syslog?).