I used the following code until today on Python 2.7 to parallelize the creation of many PNG pictures with matplotlib. Today I tried to move everything to Python 3.8, and the part that I cannot adapt involves the parallelization done with multiprocessing.
The idea is that I have a script which needs to produce several images with similar settings from different timesteps of a data file. Since the plotting routine can be parametrized, I run it over chunks of 10 timesteps distributed among different tasks to speed up the process.
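The script calls a chunks() helper to split the timesteps, but its definition is not part of the excerpt. A minimal sketch of such a helper is shown here; the body is an assumption for illustration, not code taken from the script:

```python
def chunks(seq, n):
    """Yield successive slices of seq, each holding at most n items.

    Hypothetical implementation: the real helper is not shown in the post.
    """
    for i in range(0, len(seq), n):
        yield seq[i:i + n]


# 25 timesteps split into chunks of 10 give pieces of 10, 10 and 5 items
timesteps = list(range(25))
pieces = list(chunks(timesteps, 10))
print([len(piece) for piece in pieces])
```

Pool.map accepts any iterable, so a generator like this can be passed to it directly.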
Here is the relevant part of the script; I'm not going to paste the whole thing given its length.

from multiprocessing import Pool
from functools import partial

def main():
    # arguments to be passed to the plotting functions
    # contain data and information about the plot
    args = dict(m=m, x=x, y=y, ax=ax,
                winds_10m=winds_10m, mslp=mslp, ....)
    # chunks of timesteps
    dates = chunks(time, 10)
    # partial version of the function plot_files(), see underneath
    plot_files_param = partial(plot_files, **args)
    p = Pool(8)
    p.map(plot_files_param, dates)

def plot_files(dates, **args):
    first = True
    for date in dates:
        # loop over dates, retrieve data from args, e.g. args['mslp'] and do the plotting

if __name__ == "__main__":
    import time
    start_time = time.time()
    main()
    elapsed_time = time.time() - start_time
    print_message("script took " + time.strftime("%H:%M:%S", time.gmtime(elapsed_time)))

This used to work fine on Python 2.7, but now I get this error:

Traceback (most recent call last):
  File "plot_winds10m.py", line 135, in <module>
    main()
  File "plot_winds10m.py", line 79, in main
    p.map(plot_files_param, dates)
  File "lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
  File "lib/python3.8/multiprocessing/pool.py", line 537, in _handle_tasks
    put(task)
  File "lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: cannot pickle '_thread.lock' object

The only thing that changed, besides the Python version and the package versions, is the system: I'm testing this on macOS instead of Linux, but it should not make a big difference, especially since this is all running inside a conda environment.
Does anyone have an idea on how to fix this?
(Here is the link to the GitHub repo: https://github.com/guidocioni/icon_forecasts/blob/master/plotting/plot_winds10m.py )
I figured out the problem, in case anyone arrives here desperate for an answer.
The problem is that some of the conversions I was doing with metpy.unit_array produced a pint array, which for some reason is not picklable. When I then passed this array in the args of the partial function, I got the error.
Doing the conversion with .convert_units() instead, or just extracting the array part from the data (either with .values or .magnitude), ensured that I was passing only a numpy array or a DataArray, and these objects are picklable.
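To narrow down which entry of args is the culprit, one option is to try pickling each value on its own before handing anything to the Pool. The sketch below uses only the standard library. FakeQuantity is a hypothetical stand-in for a unit-carrying array (it is not the pint or metpy class), and find_unpicklable() and strip_units() are illustrative helper names, not part of any library:

```python
import pickle
import threading


def find_unpicklable(args):
    """Return the names of the entries in args that pickle cannot serialize."""
    bad = []
    for name, value in args.items():
        try:
            pickle.dumps(value)
        except Exception:  # pickle may raise TypeError, PicklingError, ...
            bad.append(name)
    return bad


def strip_units(value):
    """Return value.magnitude if that attribute exists, else value unchanged."""
    return getattr(value, "magnitude", value)


class FakeQuantity:
    """Hypothetical stand-in for a unit-carrying array that holds a lock."""

    def __init__(self, magnitude):
        self.magnitude = magnitude
        self._lock = threading.Lock()  # a lock cannot be pickled


args = dict(mslp=[1012.0, 1008.5], winds_10m=FakeQuantity([3.2, 7.9]))
print(find_unpicklable(args))        # ['winds_10m']

# Keep only the plain data before building the partial function
clean_args = {name: strip_units(value) for name, value in args.items()}
print(find_unpicklable(clean_args))  # []
```

Running a check like this on the real args dictionary should point at the entries that still carry units, which can then be converted as described above.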