Split many ArcGIS rasters in parallel using Python multiprocessing


I want to split 10 images into 2 parts each (20 resulting images). The images are 4-band (R, G, B, NIR) NAIP imagery available from this website. I am using the arcpy package from ArcGIS to split one image at a time:

import arcpy, os

inws = r'D:\temp\temp_NAIP'  #Contains ~10 .tif images
outws = r'D:\temp\temp_NAIP_tiles'

arcpy.env.workspace = inws
rasters = arcpy.ListRasters()

for ras in rasters:
    arcpy.SplitRaster_management(
        ras, outws, 
        os.path.basename(ras).split('.')[0], 
        split_method='NUMBER_OF_TILES', 
        format='TIFF', 
        num_rasters='1 2',
        overlap=50, units='PIXELS')

How can I integrate the multiprocessing module into the above script to process, say, 4 images at a time?

Btw, I am aware of a blog post that combines multiprocessing and arcpy, although the examples are specific to vector data and I cannot figure out how to utilize the tools to process imagery.


Solution

  • Barring any resource sharing issues, converting a simple for-loop into multiprocessing is easy to do with a multiprocessing.Pool. Try something like this:

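    from multiprocessing import Pool
    import arcpy, os

    inws = r'D:\temp\temp_NAIP'  # Contains ~10 .tif images
    outws = r'D:\temp\temp_NAIP_tiles'

    # Set at module level so that every worker process sees the same workspace
    arcpy.env.workspace = inws

    def process_img(ras):
        # Runs in a worker process: split one raster into 1 x 2 tiles
        arcpy.SplitRaster_management(
            ras, outws,
            os.path.basename(ras).split('.')[0],
            split_method='NUMBER_OF_TILES',
            format='TIFF',
            num_rasters='1 2',
            overlap=50, units='PIXELS')

    if __name__ == '__main__':
        # The guard is required on Windows, where each worker re-imports this script
        rasters = arcpy.ListRasters()
        pool = Pool(processes=4)
        pool.map(process_img, rasters)
        pool.close()
        pool.join()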
    

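    So long as rasters is an iterable, it can be mapped to a process pool. Keep in mind that each worker process gets its own copy of module-level state, so each one uses its own copy of arcpy.env.workspace. On platforms that fork, that copy is inherited from the parent process; on Windows, each worker re-imports the script, which re-runs the module-level assignments. That is also why the pool should be created under an if __name__ == '__main__': guard on Windows.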
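
    One thing to watch for with pool.map: if any worker raises an exception, the whole call fails in the parent, and you do not get results for the rasters that did succeed. Below is a minimal sketch, assuming you would rather get a per-raster success/failure report. The names out_basename, split_one, safe_call and run_all are hypothetical, the example file names are made up, and split_one is only a placeholder for the arcpy.SplitRaster_management call so that the sketch runs without arcpy.

```python
import os
from functools import partial
from multiprocessing import Pool


def out_basename(path):
    """Output base name for a raster: the file name without its extension."""
    return os.path.splitext(os.path.basename(path))[0]


def split_one(path):
    """Placeholder for the real work (hypothetical).

    In the actual script, arcpy.SplitRaster_management would be called here;
    this stand-in only returns the base name that would be used for the tiles.
    """
    return out_basename(path)


def safe_call(func, item):
    """Run func(item) and return a status tuple instead of raising."""
    try:
        return (item, True, func(item))
    except Exception as exc:
        # Report the failure back to the parent as text
        return (item, False, repr(exc))


def run_all(func, items, processes=4):
    """Map func over items in a process pool, collecting one tuple per item."""
    pool = Pool(processes=processes)
    try:
        return pool.map(partial(safe_call, func), items)
    finally:
        pool.close()
        pool.join()


if __name__ == '__main__':
    # Made-up file names, for illustration only
    rasters = [r'D:\temp\temp_NAIP\a.tif', r'D:\temp\temp_NAIP\b.tif']
    for item, ok, info in run_all(split_one, rasters):
        print(item, 'ok' if ok else 'FAILED', info)
```

    Both the worker function and the wrapper are defined at module level because the pool has to pickle them to send work to the child processes. os.path.splitext is used here instead of split('.')[0] so that a file name containing extra dots is not truncated; that is a design choice in this sketch, not something the original script requires.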