
Does python logging support multiprocessing?


I have been told that logging cannot be used with multiprocessing: you have to do the concurrency control yourself, in case multiple processes mess up the log.

But I did some tests, and there seems to be no problem using logging with multiprocessing:

import time
import logging
from multiprocessing import Process, current_process


# setup log
logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
                    datefmt='%a, %d %b %Y %H:%M:%S',
                    filename='/tmp/test.log',
                    filemode='w')


def func(the_time, logger):
    proc = current_process()
    while True:
        if time.time() >= the_time:
            logger.info('proc name %s id %s' % (proc.name, proc.pid))
            return



if __name__ == '__main__':

    the_time = time.time() + 5

    for x in range(1, 10):  # xrange exists only in Python 2
        # Process names are expected to be strings
        proc = Process(target=func, name=str(x), args=(the_time, logger))
        proc.start()

As you can see from the code, I deliberately let the subprocesses write to the log at the same moment (5 s after start) to increase the chance of a conflict. But there is no conflict at all.

So my question is: can we use logging with multiprocessing? Why do so many posts say we cannot?


Solution

  • As Matino correctly explained: logging in a multiprocessing setup is not safe, because multiple processes (which know nothing about each other) write into the same file and can potentially interfere with each other.

    What happens is that every process holds an open file handle and does an "append write" into that file. The question is under what circumstances the append write is "atomic" (that is, cannot be interrupted by e.g. another process writing to the same file and intermingling its output). This problem applies to every programming language, as in the end they all make a syscall to the kernel. This answer explains under which circumstances a shared log file is OK.

    It comes down to checking your pipe buffer size (PIPE_BUF). On Linux it is defined in /usr/include/linux/limits.h and is 4096 bytes. For other OSes you can find a good list here.
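As a rough sketch of that check from Python itself: the standard library exposes the value as select.PIPE_BUF on Unix, and os.pathconf can report it for a given path. The helper names pipe_buf_size and fits_in_pipe_buf are made up for this illustration, and the check assumes one log record is written as one line with a trailing newline.

```python
import os
import select


def pipe_buf_size(path="."):
    """Return PIPE_BUF in bytes, or None if the platform does not expose it."""
    size = getattr(select, "PIPE_BUF", None)  # only defined on Unix
    if size is not None:
        return size
    try:
        return os.pathconf(path, "PC_PIPE_BUF")
    except (AttributeError, ValueError, OSError):
        # No os.pathconf (e.g. Windows), unknown name, or path not usable.
        return None


def fits_in_pipe_buf(line, limit=None, encoding="utf-8"):
    """Hypothetical helper: True if the encoded line plus newline fits the limit."""
    if limit is None:
        limit = pipe_buf_size()
    if limit is None:
        raise RuntimeError("PIPE_BUF is not available on this platform")
    return len(line.encode(encoding)) + 1 <= limit


if __name__ == "__main__":
    print("PIPE_BUF:", pipe_buf_size())
```

Note that the length that matters is that of the fully formatted line (timestamp, file name, level and message), not just the message passed to logger.info.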

    That means: if your log line is shorter than 4096 bytes (on Linux), the append is safe, provided the disk is directly attached (i.e. no network in between). For more details, please check the first link in my answer. To test this, you can call logger.info('proc name %s id %s %s' % (proc.name, proc.pid, str(proc.name)*5000)) with different lengths. With 5000, for instance, I already got mixed-up log lines in /tmp/test.log.
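Spotting mixed-up lines by eye gets tedious with long messages, so here is a small heuristic checker as a sketch. It assumes the log format and datefmt from the question, an English locale for the day and month abbreviations, and that every record contains the text 'proc name' exactly once. The name suspicious_lines is hypothetical.

```python
import re

# Header produced by the question's format string, e.g.
# "Mon, 01 Jan 2024 12:00:00 test.py[line:29] INFO "
HEADER = re.compile(
    r"^\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} \S+\[line:\d+\] [A-Z]+ "
)


def suspicious_lines(lines, marker="proc name"):
    """Return 1-based numbers of lines that look interleaved or truncated."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        # A clean line starts with the header and holds exactly one record.
        if not HEADER.match(text) or text.count(marker) != 1:
            bad.append(number)
    return bad


if __name__ == "__main__":
    try:
        with open("/tmp/test.log") as log_file:
            print(suspicious_lines(log_file))
    except FileNotFoundError:
        print("run the test script first to create /tmp/test.log")
```

An empty result only means this heuristic found nothing; it does not prove the writes were atomic.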

    In this question there are already quite a few solutions to this, so I won't add my own solution here.
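For orientation only, one commonly used approach is to let a single process own the file and have all workers send their records through a queue. The sketch below uses the standard library's logging.handlers.QueueHandler and QueueListener (Python 3.2+); it is not taken from the linked question, and the function names, logger name, format string and log path are assumptions chosen for the example.

```python
import logging
import logging.handlers
import multiprocessing
import os
import tempfile


def start_listener(queue, log_path):
    """Start the single writer: a thread in the parent that owns the log file."""
    file_handler = logging.FileHandler(log_path, mode="w")
    file_handler.setFormatter(logging.Formatter(
        "%(asctime)s %(processName)s[%(process)d] %(levelname)s %(message)s"))
    listener = logging.handlers.QueueListener(queue, file_handler)
    listener.start()
    return listener, file_handler


def get_queue_logger(queue, name="worker"):
    """Return a logger whose only handler puts records on the queue."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers[:] = [logging.handlers.QueueHandler(queue)]
    logger.propagate = False  # never fall through to a file handler
    return logger


def worker(queue, index):
    get_queue_logger(queue).info("hello from worker %d", index)


def main(log_path, workers=4):
    queue = multiprocessing.Queue()
    listener, file_handler = start_listener(queue, log_path)
    procs = [multiprocessing.Process(target=worker, args=(queue, i))
             for i in range(workers)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    listener.stop()  # handles records still queued, then stops the thread
    file_handler.close()


if __name__ == "__main__":
    main(os.path.join(tempfile.gettempdir(), "test_queue.log"))
```

Because only the listener thread writes to the file, the length of a log line no longer matters for interleaving; the trade-off is that records are lost if the parent dies before the queue is drained.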

    Update: Flask and multiprocessing

    Web frameworks like Flask run in multiple workers when hosted by uwsgi or nginx. In that case, multiple processes may write into one log file. Does that cause problems?

    Error handling in Flask is done via stdout/stderr, which is then caught by the web server (uwsgi, nginx, etc.). The web server needs to make sure that logs are written correctly (see e.g. this flask+nginx example), probably also adding process information so you can associate error lines with processes. From Flask's docs:

    By default as of Flask 0.11, errors are logged to your webserver’s log automatically. Warnings however are not.

    So you'd still have this issue of intermingled log lines if you use warn and the message exceeds the pipe buffer size.
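If you do log warnings yourself, a minimal sketch of adding the process information mentioned above is to put the standard %(processName)s and %(process)d fields into the format and write to stderr, so the web server captures the output. The format string, the make_stderr_handler helper and the logger name "app" are placeholders for this example, not something Flask prescribes.

```python
import logging
import sys

# %(process)d and %(processName)s are standard LogRecord attributes.
PROCESS_FORMAT = "%(asctime)s %(processName)s[%(process)d] %(levelname)s %(message)s"


def make_stderr_handler():
    """Hypothetical helper: a stderr handler that tags each line with the process."""
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter(PROCESS_FORMAT))
    return handler


if __name__ == "__main__":
    logger = logging.getLogger("app")  # placeholder logger name
    logger.addHandler(make_stderr_handler())
    logger.warning("disk space is low")
```

This only makes lines attributable to a worker; it does not by itself prevent interleaving of very long messages.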