I am trying to run the following code to parallelize a function that crops GeoTIFFs. The GeoTIFFs are named <location>__img_news1a_iw_rt30_<hex_code>_g_gpf_vv.tif. The code runs without errors, but it consistently skips one particular GeoTIFF: that file is never even read from the vv_tif iterable. Specifically, out of locationA_img_news1a_iw_rt30_20170314t115609_g_gpf_vv.tif, locationA_img_news1a_iw_rt30_20170606t115613_g_gpf_vv.tif and locationA_img_news1a_iw_rt30_20170712t115615_g_gpf_vv.tif, it skips locationA_img_news1a_iw_rt30_20170712t115615_g_gpf_vv.tif every single time when I process these files together with the GeoTIFFs from other locations. However, it does read this file if I build the iterable from only these three GeoTIFFs. I have tried changing chunksize, but it doesn't help. Am I missing something here?
from multiprocessing import Pool, cpu_count

try:
    pool = Pool(cpu_count())
    pool.imap_unordered(tile_geotif, vv_tif, chunksize=11)
finally:
    pool.close()
EDIT: I have 55 files in total, and it drops only the locationA_img_news1a_iw_rt30_20170712t115615_g_gpf_vv.tif file, every single time.
This is too much to fit in a comment, so I am posting it as an answer.
As far as I can tell, the map functions work in my toy examples below. I think there is an error in your input data that causes the corrupted output. Either that, or you have found a bug; if so, do try to create a reproducible example.
from multiprocessing import Pool

vv_tif = list(range(10))


def square(x):
    # Despite its name, this returns x**x, which is what the output below shows.
    return x**x


if __name__ == "__main__":  # guard needed on platforms that spawn workers
    with Pool(5) as p:
        print(p.map(square, vv_tif))
    with Pool(5) as p:
        print(list(p.imap(square, vv_tif)))
    with Pool(5) as p:
        print(list(p.imap_unordered(square, vv_tif)))
    with Pool(5) as p:
        print(list(p.imap_unordered(square, vv_tif, chunksize=11)))
Output:
[1, 1, 4, 27, 256, 3125, 46656, 823543, 16777216, 387420489]
[1, 1, 4, 27, 256, 3125, 46656, 823543, 16777216, 387420489]
[1, 1, 256, 3125, 46656, 823543, 16777216, 4, 27, 387420489]
[1, 1, 4, 27, 256, 3125, 46656, 823543, 16777216, 387420489]
Usually all four lines were identical. I ran it a few times until one of them (the imap_unordered call with the default chunksize) came back in a different order. It looks to me like it works.
Note that this demonstrates that the various map functions are not mutating the underlying data.
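If it helps with building a reproducible example, here is a minimal sketch of one way to check which inputs actually come back from the pool. Everything in it is an assumption for illustration: tile_stub is a hypothetical stand-in for your tile_geotif that simply returns the path it was given, find_unprocessed is a made-up helper name, and the file names are invented. Since imap_unordered returns an iterator rather than a list, the sketch consumes it fully, then closes and joins the pool before comparing results against the inputs.

```python
from multiprocessing import Pool


def tile_stub(path):
    """Hypothetical stand-in for tile_geotif: does no real work, just echoes its input."""
    return path


def find_unprocessed(paths, processes=2, chunksize=11):
    """Return the inputs for which no result came back from the pool."""
    pool = Pool(processes)
    try:
        # Consuming the iterator collects one result per input that was processed.
        processed = set(pool.imap_unordered(tile_stub, paths, chunksize=chunksize))
    finally:
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the worker processes to exit
    return sorted(set(paths) - processed)


if __name__ == "__main__":
    # Made-up names for illustration only; substitute the real vv_tif list.
    vv_tif = [f"location{i}_vv.tif" for i in range(55)]
    print(find_unprocessed(vv_tif))
```

With the stub worker every input should come back, so the list of unprocessed paths should be empty. If you make your real function return its input path and the same check reports a missing file, that would be a good starting point for a bug report.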