Tags: python, python-logging, joblib

Python logging with joblib returns an empty file


When I try to log to a file from a multithreaded loop, I just get an empty file.

A minimal program that illustrates the issue:

import logging
import time
from joblib import Parallel, delayed

def worker(index):
    time.sleep(0.5)
    logging.info('Hi from myfunc {}'.format(index))
    time.sleep(0.5)

def main():
    logging.basicConfig(filename='multithread_test.log', level=logging.INFO, format='%(relativeCreated)6d %(threadName)s %(message)s')
    Parallel(n_jobs=4)(delayed(worker)(m) for m in range(1, 8))
    
if __name__ == '__main__':
    main()

But when I set n_jobs=1 instead of 4, I get the expected output:

  1636 MainThread Hi from myfunc 1
  2656 MainThread Hi from myfunc 2
  3676 MainThread Hi from myfunc 3
  4696 MainThread Hi from myfunc 4
  5716 MainThread Hi from myfunc 5
  6736 MainThread Hi from myfunc 6
  7756 MainThread Hi from myfunc 7

Solution

  • It's a known issue (the worker processes can't fetch the logging state and parameters of the parent), which currently has several workarounds, such as

    • changing the parallel backend from loky to something else (see the first sketch below)
    • adding a helper function that initializes the logger in each worker
    • using queue-based logging (see the second sketch below), and so on

    Some details on the problem and possible workarounds can be found here: https://github.com/joblib/joblib/issues/1017
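
    For example, a minimal sketch of the first workaround: switching to joblib's threading backend keeps the workers in the main process, so they share the logging configuration set by basicConfig (the trade-off is thread-based rather than process-based parallelism, which is fine for I/O-bound workers like the one above):

    import logging
    import time

    from joblib import Parallel, delayed

    def worker(index):
        time.sleep(0.5)
        logging.info('Hi from myfunc {}'.format(index))
        time.sleep(0.5)

    def main():
        logging.basicConfig(filename='multithread_test.log', level=logging.INFO,
                            format='%(relativeCreated)6d %(threadName)s %(message)s')
        # Worker threads live in the main process and see the root logger configured above.
        Parallel(n_jobs=4, backend='threading')(delayed(worker)(m) for m in range(1, 8))

    if __name__ == '__main__':
        main()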

    Setting n_jobs=1 is equivalent to using a plain for loop, so there are no problems in that case.
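
    For the queue-based workaround mentioned above, here is a minimal sketch (assuming the worker processes can reach a multiprocessing.Manager queue, which is commonly reported to work with the default loky backend; the logger name 'mp_demo' is just an example). The workers only put records on a shared queue, and a QueueListener thread in the main process writes them to the file:

    import logging
    import logging.handlers
    import multiprocessing
    import time

    from joblib import Parallel, delayed

    def worker(index, queue):
        logger = logging.getLogger('mp_demo')
        logger.setLevel(logging.INFO)
        if not logger.handlers:  # avoid duplicate handlers when a worker runs several tasks
            logger.addHandler(logging.handlers.QueueHandler(queue))
        time.sleep(0.5)
        logger.info('Hi from myfunc %s', index)
        time.sleep(0.5)

    def main():
        manager = multiprocessing.Manager()
        queue = manager.Queue()
        handler = logging.FileHandler('multithread_test.log')
        handler.setFormatter(logging.Formatter('%(relativeCreated)6d %(threadName)s %(message)s'))
        # The listener drains the queue in the main process and writes to the file.
        listener = logging.handlers.QueueListener(queue, handler)
        listener.start()
        try:
            Parallel(n_jobs=4)(delayed(worker)(m, queue) for m in range(1, 8))
        finally:
            listener.stop()

    if __name__ == '__main__':
        main()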

    As for a simple and straightforward solution: you can create a separate log file for each spawned worker and merge them afterwards, as sketched below.
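
    A minimal sketch of that idea (the helper name and the per-PID file naming are just one possible choice): each worker process configures its own log file, and the main process concatenates them at the end.

    import glob
    import logging
    import os
    import time

    from joblib import Parallel, delayed

    def init_worker_logging():
        # basicConfig is a no-op if the process already has handlers,
        # so calling it from every task is safe.
        logging.basicConfig(filename='multithread_test_{}.log'.format(os.getpid()),
                            level=logging.INFO,
                            format='%(relativeCreated)6d %(threadName)s %(message)s')

    def worker(index):
        init_worker_logging()
        time.sleep(0.5)
        logging.info('Hi from myfunc {}'.format(index))
        time.sleep(0.5)

    def main():
        Parallel(n_jobs=4)(delayed(worker)(m) for m in range(1, 8))
        # Merge the per-worker files into a single log afterwards.
        with open('multithread_test.log', 'w') as merged:
            for path in sorted(glob.glob('multithread_test_*.log')):
                with open(path) as part:
                    merged.write(part.read())

    if __name__ == '__main__':
        main()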