I am looking to split 10 images into 2 parts each (20 resulting images). The images are 4-band (R,G,B,nIR) NAIP imagery available from this website. I am using the arcpy
package from ArcGIS to split one image at a time:
import arcpy, os

inws = r'D:\temp\temp_NAIP'  # Contains ~10 .tif images
outws = r'D:\temp\temp_NAIP_tiles'
arcpy.env.workspace = inws
rasters = arcpy.ListRasters()

for ras in rasters:
    arcpy.SplitRaster_management(
        ras, outws,
        os.path.basename(ras).split('.')[0],
        split_method='NUMBER_OF_TILES',
        format='TIFF',
        num_rasters='1 2',
        overlap=50, units='PIXELS')
How can I integrate the multiprocessing module into the above script to process, say, 4 images at a time?
Btw, I am aware of a blog post that combines multiprocessing and arcpy, although the examples are specific to vector data and I cannot figure out how to utilize the tools to process imagery.
Barring any resource sharing issues, converting a simple for-loop into multiprocessing is easy to do with a multiprocessing.Pool. Try something like this:
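from multiprocessing import Pool
import arcpy, os

inws = r'D:\temp\temp_NAIP'  # Contains ~10 .tif images
outws = r'D:\temp\temp_NAIP_tiles'
arcpy.env.workspace = inws  # module level, so each worker process sets it as well

def process_img(ras):
    # Split a single raster into 1 x 2 tiles with a 50-pixel overlap
    arcpy.SplitRaster_management(
        ras, outws,
        os.path.basename(ras).split('.')[0],
        split_method='NUMBER_OF_TILES',
        format='TIFF',
        num_rasters='1 2',
        overlap=50, units='PIXELS')

if __name__ == '__main__':  # needed on Windows, where workers re-import this module
    rasters = arcpy.ListRasters()
    pool = Pool(processes=4)  # process up to 4 images at a time
    pool.map(process_img, rasters)
    pool.close()
    pool.join()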
So long as rasters is an iterable, it can be mapped onto a process pool. Keep in mind that each worker process gets its own copy of the script's module-level state, so each process will use its own copy of arcpy.env.workspace.
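If one image fails, pool.map re-raises that exception in the parent and the results of the other images are lost. A minimal sketch of one way around this is below, assuming a wrapper that returns the error as text instead of raising. The names tile_base_name, split_one and safe_split are hypothetical, split_one is only a stand-in for the arcpy.SplitRaster_management call so the pattern can be tried without ArcGIS, and the two file names are placeholders for whatever arcpy.ListRasters() returns.

```python
import os
from functools import partial
from multiprocessing import Pool


def tile_base_name(raster_path):
    """Return the file name without its directory or extension."""
    return os.path.splitext(os.path.basename(raster_path))[0]


def split_one(raster_path, out_dir):
    """Hypothetical stand-in for the arcpy.SplitRaster_management call."""
    # In the real script, import arcpy and call SplitRaster_management here
    # with raster_path, out_dir and tile_base_name(raster_path).
    return os.path.join(out_dir, tile_base_name(raster_path))


def safe_split(raster_path, out_dir):
    """Run split_one and report a failure instead of raising in the worker."""
    try:
        return raster_path, split_one(raster_path, out_dir), None
    except Exception as exc:  # errors raised by arcpy would be caught here too
        return raster_path, None, str(exc)


if __name__ == '__main__':
    in_dir = r'D:\temp\temp_NAIP'
    out_dir = r'D:\temp\temp_NAIP_tiles'
    names = ['a.tif', 'b.tif']  # placeholder for the arcpy.ListRasters() result
    paths = [os.path.join(in_dir, name) for name in names]

    pool = Pool(processes=4)
    try:
        # partial() binds out_dir so that map only has to supply the path
        results = pool.map(partial(safe_split, out_dir=out_dir), paths)
    finally:
        pool.close()
        pool.join()

    for path, result, error in results:
        print(path, 'failed: ' + error if error else 'ok')
```

Passing full paths to the workers means they do not depend on arcpy.env.workspace being set in each process. Note that os.path.splitext keeps any dots inside the file name, which differs from the split('.')[0] used above.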