I implemented some small statistical functions and parallelized them with multiprocessing. The overall structure of the code looks like this:
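import functools
import multiprocessing as mp

def worker(args, no):
    # Create a Stat instance in the child process and fit it.
    f = Stat(args)  # constructor not shown in this outline
    f.fit()
    return f.result

class Stat:
    def fit(self):
        ...  # doing various things

    def bootstrap(self):
        parameter = ...  # set parameters for Stat
        # Use a new name: rebinding `worker` here would make it local and fail.
        task = functools.partial(worker, parameter)
        with mp.Pool(mp.cpu_count()) as p:
            for i, _ in enumerate(p.imap_unordered(task, range(1000))):
                pass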
So the bootstrap method of the Stat class starts processes that run a function which creates an instance of Stat and calls its fit() method. I suspect this approach is quite inefficient. Would it be better to replace the class with functions? Or does using classes like this have no effect on multiprocessing performance?
It's not inefficient (it won't affect performance); it's just unorthodox. It would probably be a little cleaner if you took bootstrap out of Stat, since it doesn't look like it benefits from being a method of that class:
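import functools
import multiprocessing as mp

def worker(args, no):
    f = Stat(args)  # constructor not shown in this outline
    f.fit()
    return f.result

def bootstrap():  # plain function now, so no `self`
    parameter = ...  # set parameters for Stat
    # Use a new name: rebinding `worker` here would make it local and fail.
    task = functools.partial(worker, parameter)
    with mp.Pool(mp.cpu_count()) as p:
        for i, _ in enumerate(p.imap_unordered(task, range(1000))):
            pass

class Stat:
    def fit(self):
        ...  # doing various things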
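To make that concrete, here is a minimal runnable sketch of the same layout. The toy Stat (whose fit just takes a mean), the data and seed arguments, the n_resamples and processes parameters, and the chunksize value are all assumptions of mine for illustration, not your actual code:

```python
import functools
import multiprocessing as mp
import random


class Stat:
    """Toy stand-in for the real class: 'fitting' is just taking the mean."""

    def __init__(self, data):
        self.data = data
        self.result = None

    def fit(self):
        self.result = sum(self.data) / len(self.data)
        return self


def worker(data, seed):
    """Resample `data` with replacement, fit a fresh Stat, return its result."""
    rng = random.Random(seed)  # per-task seed keeps each resample reproducible
    sample = rng.choices(data, k=len(data))
    return Stat(sample).fit().result


def bootstrap(data, n_resamples=1000, processes=None):
    """Run `n_resamples` independent fits across a pool of worker processes."""
    task = functools.partial(worker, data)
    with mp.Pool(processes) as pool:  # None lets Pool pick the worker count
        # imap_unordered yields results as they finish, in arbitrary order.
        return list(pool.imap_unordered(task, range(n_resamples), chunksize=50))


if __name__ == "__main__":
    data = [random.gauss(0.0, 1.0) for _ in range(200)]
    estimates = sorted(bootstrap(data))
    # Approximate 95% percentile interval from 1000 sorted estimates.
    print("bootstrap interval for the mean:", estimates[25], estimates[975])
```

Note that worker and bootstrap live at module level, so the pool can pickle the task by reference, and the `if __name__ == "__main__":` guard is needed on platforms that start workers by spawning a fresh interpreter. Whether fit is a method or a free function makes no difference to any of this.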