Tags: r, geospatial, raster, rgdal, spdf

Can you combine polygons within a SpatialPolygonsDataFrame by values in the dataframe?


I am trying to use the Australian Bureau of Statistics shapefile for Remoteness 2016, downloaded as an ESRI shapefile.

I want to combine all of the polygons that are not Major Cities of Australia.

library(rgdal)
library(dplyr)
RA_2016 <- readOGR(".", layer = "RA_2016_AUST")
# To simplify, use only NSW (STE_CODE16 == 1)
RA_2016 <- RA_2016[RA_2016$STE_CODE16 == 1, ]

The data frame has 5 columns. I don't need any of this data once I have created a variable for whether it is a Major City or not.

MajorCity <- data.frame(10:14, c("Major City", "Regional", "Regional", "Regional", "Regional"))
names(MajorCity) <- c("RA_CODE16", "Bigsmoke")

RA_2016@data <- left_join(RA_2016@data, MajorCity)

What I want to do now is merge the polygons that have Bigsmoke == "Regional". I do not need any of the original data from RA_2016, but I want the result to remain a SpatialPolygonsDataFrame with the column "Bigsmoke".

In the next part of the code I am going to combine it with the Australian Bureau of Statistics' LGA data, so that any LGA that straddles the boundary can be split into its regional part and its major city part. So I think I need to keep that minimal amount of data.

Is there a good way to do this? Is there another post I have failed to find that will show me the way?


Solution

  • Here is how I would do that:

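    library(raster)
    
    # Read the shapefile and keep only NSW (STE_CODE16 == 1)
    RA_2016 <- shapefile("RA_2016_AUST.shp")
    RA_2016 <- RA_2016[RA_2016$STE_CODE16 == 1, ]
    
    # Lookup table mapping each remoteness code to the new grouping
    MajorCity <- data.frame(RA_CODE16=10:14, Bigsmoke=c("Major City", "Regional", "Regional", "Regional", "Regional"))
    
    # Join the lookup to the spatial object, then dissolve polygons by "Bigsmoke"
    m <- merge(RA_2016, MajorCity)
    x <- aggregate(m, "Bigsmoke")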
    x
    #class       : SpatialPolygonsDataFrame 
    #features    : 2 
    #extent      : 140.9993, 159.1092, -37.50508, -28.15702  (xmin, xmax, ymin, ymax)
    #crs         : +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs 
    #variables   : 1
    #names       :   Bigsmoke 
    #min values  : Major City 
    #max values  :   Regional 
    

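The same merge-then-aggregate pattern can be tried without downloading the ABS shapefile. What follows is only a minimal sketch, assuming the sp and raster packages are installed. The three unit squares, the `unit_square` helper, and the names `toy` and `lookup` are made up for illustration and are not part of the original data.

```r
library(sp)
library(raster)

# Hypothetical helper: a unit square whose lower-left corner is (x0, y0).
# The ring runs clockwise and is flagged explicitly as an outer boundary.
unit_square <- function(x0, y0, id) {
  ring <- cbind(x = c(x0, x0, x0 + 1, x0 + 1, x0),
                y = c(y0, y0 + 1, y0 + 1, y0, y0))
  Polygons(list(Polygon(ring, hole = FALSE)), ID = id)
}

# Three adjacent squares standing in for remoteness areas (made-up geometry)
toy <- SpatialPolygonsDataFrame(
  SpatialPolygons(list(unit_square(0, 0, "a"),
                       unit_square(1, 0, "b"),
                       unit_square(2, 0, "c"))),
  data.frame(RA_CODE16 = 10:12, row.names = c("a", "b", "c"))
)

# Lookup table in the same shape as MajorCity, cut down to three codes
lookup <- data.frame(RA_CODE16 = 10:12,
                     Bigsmoke = c("Major City", "Regional", "Regional"))

m <- merge(toy, lookup)        # attribute join on the shared RA_CODE16 column
x <- aggregate(m, "Bigsmoke")  # dissolve polygons that share a Bigsmoke value

length(x)  # number of features after dissolving
names(x)   # attribute columns that remain
```

If this behaves like the real data, `x` should hold two features and a single `Bigsmoke` column, with the two "Regional" squares dissolved into one polygon.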
    Avoid writing to slots with "@", as you do here: left_join(RA_2016@data, MajorCity). That could change the order and the number of records, so that the attributes would no longer match the geometries.
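To make the risk concrete, here is a small base R sketch that uses a plain data frame as a stand-in for the attribute table. The values and the names `attrs` and `lookup` are invented for illustration. It shows how a join can silently reorder rows, and one possible alternative that adds the column in place with `match()`, so the row order and row count cannot change.

```r
# Stand-in for the attribute table: one row per polygon, in geometry order
attrs <- data.frame(RA_CODE16 = c(12, 10, 11, 10))

lookup <- data.frame(
  RA_CODE16 = 10:12,
  Bigsmoke = c("Major City", "Regional", "Regional"),
  stringsAsFactors = FALSE
)

# merge() on plain data frames sorts by the key by default, so the rows
# no longer follow the original (geometry) order
reordered <- merge(attrs, lookup)
reordered$RA_CODE16  # 10 10 11 12, not 12 10 11 10

# match() finds each row's code in the lookup without touching row order
attrs$Bigsmoke <- lookup$Bigsmoke[match(attrs$RA_CODE16, lookup$RA_CODE16)]
attrs
```

The same assignment should work directly on a Spatial*DataFrame, for example `RA_2016$Bigsmoke <- MajorCity$Bigsmoke[match(RA_2016$RA_CODE16, MajorCity$RA_CODE16)]`, because sp supports `$<-` on these objects without going through the `@data` slot.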