Search code examples
gispostgisshapefilepolygonsarcmap

Shapefile with overlapping polygons: calculate average values


I have a very big polygon shapefile with hundreds of features, often overlapping each other. Each of these features has a value stored in the attribute table. I simply need to calculate the average values in the areas where they overlap. I can imagine that this task requires several intricate steps: I was wondering if there is a straightforward methodology. I’m open to every kind of suggestion, I can use ArcMap, QGis, arcpy scripts, PostGis, GDAL… I just need ideas. Thanks!


Solution

  • After few attempts, I found a solution by rasterising all the features singularly and then performing cell statistics in order to calculate the average. See below the script I wrote, please do not hesitate to comment and improve it! Thanks!

        #This script processes a shapefile of snow persistence (area of interest: Afghanistan).
        #the input shapefile represents a month of snow cover and contains several features.
        #each feature represents a particular day and a particular snow persistence (low,medium,high,nodata)
        #these features are polygons multiparts, often overlapping.
        #a feature of a particular day can overlap a feature of another one, but features of the same day and with
        #different snow persistence can not overlap each other.
        #(potentially, each shapefile contains 31*4 feature).
        #the script takes the features singularly and exports each feature in a temporary shapefile
        #which contains only one feature.
        #Then, each feature is converted to raster, and after
        #a logical conditional expression gives a value to the pixel according the intensity (high=3,medium=2,low=1,nodata=skipped).
        #Finally, all these rasters are summed and divided by the number of days, in order to
        #calculate an average value.
        #The result is a raster with the average snow persistence in a particular month.
        #This output raster ranges from 0 (no snow) to 3 (persistent snow for the whole month)
        #and values outside this range should be considered as small errors in pixel overlapping.
        #This script needs a particular folder structure. The folder C:\TEMP\Afgh_snow_cover contains 3 subfolders
        #input, temp and outputs. The script takes care automatically of the cleaning of temporary data
    
    
        import arcpy, numpy, os
        from arcpy.sa import *
        from arcpy import env
    
        #function for finding unique values of a field in a FC
        def unique_values_in_table(table, field):
                data = arcpy.da.TableToNumPyArray(table, [field])
                return numpy.unique(data[field])
    
        #check extensions
        try:
            if arcpy.CheckExtension("Spatial") == "Available":
                arcpy.CheckOutExtension("Spatial")
            else:
                # Raise a custom exception
                #
                raise LicenseError
        except LicenseError:
            print "spatial Analyst license is unavailable"  
        except:
            print arcpy.GetMessages(2)
        finally:
            # Check in the 3D Analyst extension
            #
            arcpy.CheckInExtension("Spatial")
    
        # parameters and environment
        temp_folder = r"C:\TEMP\Afgh_snow_cover\temp_rasters"
        output_folder = r"C:\TEMP\Afgh_snow_cover\output_rasters"
        env.workspace = temp_folder
        unique_field = "FID"
        field_Date = "DATE"
        field_Type = "Type"
        cellSize = 0.02
    
    
        fc = r"C:\TEMP\Afgh_snow_cover\input_shapefiles\snow_cover_Dec2007.shp"
    
        stat_output_name = fc[-11:-4] + ".tif"
    
        #print stat_output_name
        arcpy.env.extent = "MAXOF"
    
        #find all the uniquesID of the FC
        uniqueIDs = unique_values_in_table(fc, "FID")
    
        #make layer for selecting
        arcpy.MakeFeatureLayer_management (fc, "lyr")
        #uniqueIDs = uniqueIDs[-5:]
        totFeatures = len(uniqueIDs)
        #for each feature, get the date and the type of snow persistence(type can be high, medium, low and nodata)
        for i in uniqueIDs:
            SC = arcpy.SearchCursor(fc)
            for row in SC:
                if row.getValue(unique_field) == i:
                    datestring = row.getValue(field_Date)
                    typestring = row.getValue(field_Type)
    
            month = str(datestring.month)
            day = str(datestring.day)
            year = str(datestring.year)
    
        #format month and year string
            if len(month) == 1:
                month = '0' + month
    
            if len(day) == 1:
                day = '0' + day
    
        #convert snow persistence to numerical value
            if typestring == 'high':
                typestring2 = 3
            if typestring == 'medium':
                typestring2 = 2
            if typestring == 'low':
                typestring2 = 1
            if typestring == 'nodata':
                typestring2 = 0
        #skip the NoData features, and repeat the following for each feature (a feature is a day and a persistence value)
            if typestring2 > 0:
                    #create expression for selecting the feature
                    expression = ' "FID" = ' + str(i) + ' '
                    #select the feature
                    arcpy.SelectLayerByAttribute_management("lyr", "NEW_SELECTION", expression)
                    #create 
                    #outFeatureClass = os.path.join(temp_folder, ("M_Y_" + str(i)))
                    #create faeture class name, writing the snow persistence value at the end of the name
                    outFeatureClass = "Afg_" + str(year) + str(month) + str(day) + "_" + str(typestring2) + '.shp'
                    #export the feature
                    arcpy.FeatureClassToFeatureClass_conversion("lyr", temp_folder, outFeatureClass)
                    print "exported FID " + str(i) + " \ " + str(totFeatures)
                    #create name of the raster and convert the newly created feature to raster
                    outRaster = outFeatureClass[4:-4] + ".tif"
                    arcpy.FeatureToRaster_conversion(outFeatureClass, field_Type, outRaster, cellSize)
                    #remove the temporary fc
                    arcpy.Delete_management(outFeatureClass)
            del SC, row
        #now many rasters are created, representing the snow persistence types of each day. 
        #list all the rasters created 
        rasterList = arcpy.ListRasters("*", "All")
        print rasterList
    
        #now the rasters have values 1 and 0. the following loop will
        #perform CON expressions in order to assign the value of snow persistence
        for i in rasterList:
                print i + ":"
                inRaster = Raster(i)
                #set the value of snow persistence, stored in the raster name
                value_to_set = i[-5]
                inTrueRaster = int(value_to_set)
                inFalseConstant = 0
                whereClause = "Value > 0"
    
    
                # Check out the ArcGIS Spatial Analyst extension license
                arcpy.CheckOutExtension("Spatial")
                print 'Executing CON expression and deleting input'
                # Execute Con , in order to assign to each pixel the value of snow persistence
                print str(inTrueRaster)
                try:
                        outCon = Con(inRaster, inTrueRaster, inFalseConstant, whereClause)
                except:
                        print 'CON expression failed (probably empty raster!)'
    
                nameoutput = i[:-4] + "_c.tif"
                outCon.save(nameoutput)
                #delete the temp rasters with values 0 and 1
                arcpy.Delete_management(i)
        #list the raster with values of snow persistence
        rasterList = arcpy.ListRasters("*_c.tif", "All")
        #sum the rasters
        print "Caclulating SUM"
        outCellStats = CellStatistics(rasterList, "SUM", "DATA")
        #calculate the number of days (num of rasters/3)
        print "Calculating day ratio"
        num_of_rasters = len(rasterList)
        print 'Num of rasters : ' + str(num_of_rasters)
        num_of_days = num_of_rasters / 3
        print 'Num of days : ' + str(num_of_days)
        #in order to store decimal values, multiplicate the raster by 1000 before dividing
        outCellStats = outCellStats * 1000 / num_of_days
        #save the output raster
        print "saving output " + stat_output_name
        stat_output_name = os.path.join(output_folder,stat_output_name)
        outCellStats.save(stat_output_name)
        #delete the remaining temporary rasters
        print "deleting CON rasters"
        for i in rasterList:
                print "deleting " + i
                arcpy.Delete_management(i)
        arcpy.Delete_management("lyr")