I'm trying to use multiprocessing in Python (2.7.8) on Mac OS X. After reading Velimir Mlaker's answer to this question, I was able to use multiprocessing.Pool() to multiprocess a trivially simple function, but it doesn't work with my actual function: I get the right results, but it executes serially. I believe the problem is that my function loops over a music21 Stream, which is similar to a list but has special functionality for music data. I believe that music21 streams cannot be pickled, so is there some multiprocessing alternative to Pool that I can use? I don't mind if results are returned out of order, and I can upgrade to a different version of Python if necessary. I've included my code for the multiprocessing task but not for the stream_indexer() function it calls. Thank you!
import multiprocessing as mp

def basik(test_piece, part_numbers):
    jobs = []
    for i in part_numbers:
        # Each 2-tuple in jobs has an index <i> and a music21 stream that
        # corresponds to an individual part in a musical score.
        jobs.append((i, test_piece.parts[i]))
    pool = mp.Pool(processes=4)
    results = pool.map(stream_indexer, jobs)
    pool.close()
    pool.join()
    return results
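To confirm the pickling issue, here is a minimal check (a sketch, assuming music21 and its bundled corpus are installed) that simply tries to pickle a single part and reports what happens:

import pickle
import music21

piece = music21.corpus.parse('bach/bwv66.6')
try:
    # A Pool worker would need to receive this part via pickle.
    pickle.dumps(piece.parts[0])
    print('part pickled successfully')
except Exception as exc:
    print('part is not picklable:', exc)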
The newest git commits of music21 have features to help with some of the trickier parts of multiprocessing, based on joblib. For instance, if you want to count all the notes in a part, you can normally do it in serial:
import music21

def countNotes(s):
    # using recurse() instead of .flat to avoid certain caches...
    return len(s.recurse().notes)

bach = music21.corpus.parse('bach/bwv66.6')
[countNotes(p) for p in bach.parts]
In parallel, it works like this:
music21.common.runParallel(list(bach.parts), countNotes)
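Since runParallel is built on joblib, it is conceptually close to this sketch (a simplification with a hypothetical name, runParallelSketch; the real implementation presumably handles details like worker counts and progress reporting):

from joblib import Parallel, delayed

def runParallelSketch(iterable, func):
    # Fan each element out to a worker process and collect the
    # results in order; n_jobs=-1 uses all available cores.
    return Parallel(n_jobs=-1)(delayed(func)(item) for item in iterable)

# e.g. runParallelSketch(list(bach.parts), countNotes)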
BUT! Here's the huge caveat. Let's time these:
In [5]: %timeit music21.common.runParallel(list(bach.parts), countNotes)
10 loops, best of 3: 152 ms per loop

In [6]: %timeit [countNotes(p) for p in bach.parts]
100 loops, best of 3: 2.19 ms per loop
On my computer (2 cores, 4 threads), running in parallel is roughly 70x SLOWER than running in serial (152 ms vs. 2.19 ms). Why? Because there's significant overhead in preparing a Stream to be multiprocessed. If the routine being run is very slow (on the order of 1 ms per note, divided by the number of processors), then it's worth passing Streams around in multiprocessing. Otherwise, see if there are ways to pass only small bits of information back and forth, such as the path to process:
def parseCountNotes(fn):
    s = music21.corpus.parse(fn)
    return len(s.recurse().notes)

bach40 = [b.sourcePath for b in music21.corpus.search('bwv')[0:40]]
In [32]: %timeit [parseCountNotes(b) for b in bach40]
1 loops, best of 3: 2.39 s per loop
In [33]: %timeit music21.common.runParallel(bach40, parseCountNotes)
1 loops, best of 3: 1.83 s per loop
Here we're beginning to get speedups even on a MacBook Air. On my office Mac Pro, the speedups get huge for calls such as this. In this case, the call to parse() massively dominates the time spent in recurse().
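Applied back to the question's basik() function, the same path-passing idea with a plain multiprocessing.Pool looks roughly like this (a sketch: indexPart and the note count are stand-ins for whatever stream_indexer actually computes):

import multiprocessing as mp
import music21

def indexPart(args):
    # Re-parse inside the worker so that only small, picklable values
    # (a corpus path and a part index) cross the process boundary.
    # Re-parsing per part is wasteful, but it avoids pickling Streams.
    path, i = args
    score = music21.corpus.parse(path)
    return i, len(score.parts[i].recurse().notes)

if __name__ == '__main__':
    path = 'bach/bwv66.6'
    num_parts = len(music21.corpus.parse(path).parts)
    pool = mp.Pool(processes=4)
    results = pool.map(indexPart, [(path, i) for i in range(num_parts)])
    pool.close()
    pool.join()
    print(results)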