When I try to log to a file from a multithreaded loop, I just get an empty file.
A minimal program that illustrates the issue follows:
import logging
import time

from joblib import Parallel, delayed


def worker(index):
    time.sleep(0.5)
    logging.info('Hi from myfunc {}'.format(index))
    time.sleep(0.5)


def main():
    logging.basicConfig(filename='multithread_test.log', level=logging.INFO,
                        format='%(relativeCreated)6d %(threadName)s %(message)s')
    Parallel(n_jobs=4)(delayed(worker)(m) for m in range(1, 8))


if __name__ == '__main__':
    main()
But when I set n_jobs=1 instead of 4, I get the expected output:
1636 MainThread Hi from myfunc 1
2656 MainThread Hi from myfunc 2
3676 MainThread Hi from myfunc 3
4696 MainThread Hi from myfunc 4
5716 MainThread Hi from myfunc 5
6736 MainThread Hi from myfunc 6
7756 MainThread Hi from myfunc 7
It's a known issue (workers can't fetch the logging state and parameters), which currently has several workarounds, such as changing the backend from loky to anything else. Some details on the problem and possible workarounds can be found here: https://github.com/joblib/joblib/issues/1017
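As a rough sketch of the backend workaround, here is the program from the question with backend='threading' passed to Parallel. Threads run inside the parent process, so they see the logging configuration set up by basicConfig. The log_path parameter and force=True (Python 3.8+) are my additions for illustration and are not part of the original program.

```python
import logging
import time

from joblib import Parallel, delayed


def worker(index):
    time.sleep(0.5)
    logging.info('Hi from myfunc {}'.format(index))
    time.sleep(0.5)


def main(log_path='multithread_test.log'):
    # log_path is a hypothetical parameter added for this sketch.
    # force=True (Python 3.8+) replaces handlers already attached to the root logger.
    logging.basicConfig(filename=log_path, level=logging.INFO,
                        format='%(relativeCreated)6d %(threadName)s %(message)s',
                        force=True)
    # Threads share the parent's logging setup, unlike loky's separate worker processes.
    Parallel(n_jobs=4, backend='threading')(delayed(worker)(m) for m in range(1, 8))


if __name__ == '__main__':
    main()
```

Keep in mind that threads only help if the work releases the GIL (I/O, sleeping, most NumPy calls); for pure-Python CPU-bound work this backend will probably not give a speedup.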
Setting n_jobs=1 is equivalent to using a simple for loop, so there should be no problems.
As for a simple and straightforward solution: you can write a separate log file for each spawned worker and merge them afterwards.
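A minimal sketch of that per-worker approach could look like the code below. The helper names get_worker_logger and merge_logs, the worker_logs directory and the file naming scheme are all hypothetical choices of mine, not anything joblib provides. It assumes one-line log messages and an initially empty log directory, and it writes the absolute %(created)f timestamp so that lines coming from different processes can be sorted.

```python
import glob
import logging
import os
import time


def get_worker_logger(log_dir):
    """Return a logger that writes to a file named after the current process ID."""
    logger = logging.getLogger('worker.{}'.format(os.getpid()))
    if not logger.handlers:  # configure only once per process
        path = os.path.join(log_dir, 'worker_{}.log'.format(os.getpid()))
        handler = logging.FileHandler(path)
        # %(created)f is an absolute timestamp, comparable across processes.
        handler.setFormatter(logging.Formatter('%(created)f %(process)d %(message)s'))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False
    return logger


def worker(index, log_dir):
    time.sleep(0.5)
    get_worker_logger(log_dir).info('Hi from myfunc {}'.format(index))
    time.sleep(0.5)


def merge_logs(log_dir, merged_path):
    """Concatenate all worker_*.log files, sorted by the leading timestamp."""
    lines = []
    for path in glob.glob(os.path.join(log_dir, 'worker_*.log')):
        with open(path) as f:
            lines.extend(f.readlines())
    lines.sort(key=lambda line: float(line.split(' ', 1)[0]))
    with open(merged_path, 'w') as out:
        out.writelines(lines)


def main():
    # Imported here so the helpers above depend only on the standard library.
    from joblib import Parallel, delayed
    log_dir = 'worker_logs'
    os.makedirs(log_dir, exist_ok=True)
    Parallel(n_jobs=4)(delayed(worker)(m, log_dir) for m in range(1, 8))
    merge_logs(log_dir, 'multithread_test.log')


if __name__ == '__main__':
    main()
```

Each worker process only ever touches its own file, so no locking is needed; the merge happens in the parent after Parallel has returned.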