I have code using a ProcessPoolExecutor, which can't pickle lambdas and functions. Some of the code that I want to execute in parallel uses a defaultdict with a default value of None.
How would you proceed? If at all possible, I would prefer not to touch the parallelizing code.
What I have:
import itertools
import os
from collections import defaultdict
from concurrent.futures import ProcessPoolExecutor


class SomeClass:
    def __init__(self):
        self.some_dict = defaultdict(lambda: None)

    def generate(self):
        <some code>


def some_method_to_parallelize(x: SomeClass):
    <some code>


def some_method():
    max_workers = round(os.cpu_count() // 1.5)
    invocations_per_process = 100
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        # Submit one task per invocation; each task gets its own SomeClass instance.
        data = [executor.submit(some_method_to_parallelize, SomeClass()) for _ in range(invocations_per_process)]
        data = list(itertools.chain.from_iterable([r.result() for r in data]))
Try:
collections.defaultdict(type(None))
That gets you a reference to NoneType for use as your defaultdict's default factory. When constructed, it produces None, and unlike a lambda, appears to be picklable.
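To check the claim, here is a minimal sketch that round-trips both variants through pickle directly, which is the same serialization a ProcessPoolExecutor uses to send arguments to its workers. The key names "seen" and "missing" and the variable names are made up for illustration; nothing here comes from the original code.

```python
import pickle
from collections import defaultdict

# NoneType is a picklable callable; calling it with no arguments returns None.
picklable = defaultdict(type(None))
picklable["seen"] = 1  # hypothetical key, for illustration only

# Round-trip through pickle, as happens when an argument is sent to a worker.
restored = pickle.loads(pickle.dumps(picklable))
print(restored["seen"])     # the existing key survives the round trip
print(restored["missing"])  # a missing key still falls back to None

# The lambda version fails as soon as pickle tries to serialize the factory.
try:
    pickle.dumps(defaultdict(lambda: None))
    lambda_pickled = True
except (pickle.PicklingError, AttributeError, TypeError) as exc:
    lambda_pickled = False
    print(f"lambda factory is not picklable: {type(exc).__name__}")
```

If type(None) reads as too obscure, a named module-level function that returns None should work as well, since pickle serializes such functions by reference to their module and name.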