I have a function that I'm running with a `concurrent.futures.ProcessPoolExecutor`. The function looks up a key in a dictionary, which is a global variable; the function never modifies the dictionary. However, any changes I make to the dictionary inside the `if __name__ == '__main__':` block are not seen by the process pool workers, even though the modifications happen before any of the workers are started.

Here's my MRE:
```python
import concurrent.futures

NUM_KEYS = 30
D = {}

def func(key):
    return D[key]

if __name__ == '__main__':
    for k in range(NUM_KEYS):
        D[k] = k * 10
    executor = concurrent.futures.ProcessPoolExecutor(max_workers=4)
    results = {k: executor.submit(func, k) for k in range(NUM_KEYS)}
    for k, future in results.items():
        print(k, future.result())
```
Each worker fails with a `KeyError`.
It turns out the entire file gets re-imported for each subprocess (this is how the "spawn" start method works, which is the default on Windows and macOS). In those processes, `__name__` is no longer equal to `'__main__'` but rather `'__mp_main__'`, so the code that adds the keys to the dictionary never runs, and each worker sees an empty `D`.
I'll change my program to pass the dictionary to the function explicitly as an argument, since in my actual code it's costly to compute.