Search code examples
rfor-loopspatialr-sprgdal

R - nested loop for list of SpatialLinesDataFrame intersected with SpatialPolygonsDataFrame objects


I have a series of steps I need to complete on a list of SpatialLinesDataFrame ('lines' herein) objects based on their relationships with individual features within a multi-feature SpatialPolygonsDataFrame ('polygons') object. In short, each line list element originates inside a single polygon feature, and may or may not pass through one or more other polygon features. I want to update each line element to connect origin polygons to the first point of contact for each individual polygon intersected by the line element. So, each line element may become multiple new line features (n=number of intersected polygons).

I would like to do this efficiently as my lines lists and polygon features are numerous. I have provided example data and STEP-wise description of what I am trying to do below. I am new to R and not a programmer, so I don't know if any of what I propose is valid.

example lines + polygons

library(sp) 
library(rgdal)
library(raster)

###example data prep START
#example 'RDCO Regional Parks' data can be downloaded here: 
https://data-rdco.opendata.arcgis.com/datasets group_ids=1950175c56c24073bb5cef3900e19460 

parks <- readOGR("/path/to/example/data/RDCO_Regional_Parks/RDCO_Regional_Parks.shp")
plot(parks)

#subset watersheds for example
parks_sub <- parks[parks@data$Shapearea > 400000,]
plot(parks_sub, col='green', axes = T)

#create SpatialLines from scratch
pts_line1 <- cbind(c(308000, 333000), c(5522000, 5530000))
line1 <- spLines(pts_line1, crs = crs(parks_sub))
plot(line1, axes=T, add=T) #origin polygon = polyl[[4]] = OBJECTID 181

pts_line2 <- cbind(c(308000, 325000), c(5524000, 5537000))
line2 <- spLines(pts_line2, crs = crs(parks_sub))
plot(line2, axes=T, add=T) #origin polygon = polyl[[8]] = OBJECTID 1838

linel <- list()
linel[[1]] <- line1
linel[[2]] <- line2

#convert to SpatialLinesDataFrame objects
lineldf <- lapply(1:length(linel), function(i) SpatialLinesDataFrame(linel[[i]], data.frame(id=rep(i, length(linel[[i]]))), match.ID = FALSE))  

#match id field value with origin polygon
lineldf[[1]]@data$id <- 181
lineldf[[2]]@data$id <- 1838
###example data prep END

#initiate nested for loop
for (i in 1:length(lineldf)) {
  for (j in 1:length(parks_sub[j,])) {

#STEP 1:for each line list feature (NB: with ID matching origin polygon ID) 
#identify whether it intersects with a polygon list feature
    if (tryCatch(!is.null(intersect(lineldf[[i]], parks_sub[j,])), error=function(e) return(FALSE)) == 'FALSE'){
     next
    }
#if 'FALSE', go on to check intersect with next polygon in list
#if 'TRUE', go to STEP 2

#STEP 2: add intersected polygon OBJECTID value to SLDF new column in attribute table
#i.e., deal with single intersected polygon at a time
    else {
      lineldf[[i]]@data["id.2"] = parks_sub[j,]@data$OBJECTID

#STEP 3: erase portion of line overlapped by intersected SPDF
      line_erase <- erase(lineldf[[i]],parks_sub[j,])

#STEP 4: erase line feature(s) that no longer intersect with the origin polygon
#DO NOT KNOW HOW TO SELECT feature (i.e., line segment) within 'line_erase' object
      if (tryCatch(!is.null(intersect(line_erase[???], parks_sub[j,])), error=function(e) return(FALSE)) == 'FALSE'){
        line_erase[???] <- NULL}
      else {

 #STEP 5: erase line features that only intersect with origin polygon
           if (line_erase[???]@data$id.2 = parks_sub[j,]@data$OBJECTID){
             line_erase[???] <- NULL
           }
              else {
 #STEP 6: write valid line files to folder
        writeOGR(line_erase, 
                 dsn = paste0("path/to/save/folder", i, ".shp"),
                 layer = "newline",
                 driver = 'ESRI Shapefile',
                 overwrite_layer = T)
      }}}}}

Solution

  • Here's a solution using the sf package. I'll work with the objects you create and convert them to sf, although you can read shapefiles into sf objects and create them from scratch so if there's no other reason to use sp objects I'd recommend that.

    Use these packages:

    library(sf)
    library(dplyr)
    

    Convert polygons. I'm dropping a load of columns from parks_sub just so it can print neater - if you need them don't drop them:

    p = st_as_sf(parks_sub)
    p = p[,c("OBJECTID","PARK_NAME")]
    

    Convert lines. I'm only going to work with your first line object, write a loop over your list to do a whole set:

    l1 = st_as_sf(lineldf[[1]])
    

    Next step is to compute all the intersection points between your line and your polygons. You have to: a) convert the polygons to linestrings otherwise the intersection of a polygon and a line is a line, and b) convert the "MULTIPOINT" intersections when a line crosses a polygon more than once into a set of POINT objects using st_cast:

    pts = st_cast(st_cast(st_intersection(l1,
                 st_cast(p,"MULTILINESTRING")
                 ),"MULTIPOINT"),"POINT")
    

    Next we need the first point of the line. For the data you create in the example, the line end in the polygon is actually the second point, so I'll extract that here.

    first_point = st_cast(l1$geom,"POINT")[2]
    

    If in your real data its the first point then put [1] in there. If it depends then that's another little problem.

    Now compute the distances from that first point to all the intersection points:

    pts$d_first = st_distance(first_point, pts)[1,]
    

    So what we want now is the nearest intersection point in each group of points defined by the same polygon ID.

    near_pts = pts %>% group_by(OBJECTID)  %>% filter(d_first==min(d_first))
    

    Then the desired lines are constructed from the first point to those nearest points:

    nlines = st_cast(st_union(near_pts, first_point),"LINESTRING")
    

    Plot the polygons and the lines in decreasing width to show the overlap:

    plot(p$geom)
    plot(nlines$geom, lwd=c(10,7,4), col=c("black","grey","white"), add=TRUE)
    

    enter image description here

    Note the three lines include a short one (in white) from the first point to the boundary of the polygon it is in - if you don't want this you can filter out the point with the nearest distance from the data frame before constructing the lines - but that assumes the first point is inside a polygon...

    nlines retains the attributes of the polygons the line intersects, as well as the ID of the line:

    > nlines
    Simple feature collection with 3 features and 4 fields
    geometry type:  LINESTRING
    dimension:      XY
    bbox:           xmin: 310276 ymin: 5522728 xmax: 333000 ymax: 5530000
    epsg (SRID):    26911
    proj4string:    +proj=utm +zone=11 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
       id OBJECTID      PARK_NAME     d_first                       geometry
    1 181     2254  Mission Creek  6794.694 m LINESTRING (326528.6 552792...
    2 181     1831    Glen Canyon 23859.161 m LINESTRING (310276 5522728,...
    3 181     1838 Black Mountain  1260.329 m LINESTRING (331799.6 552961...
    

    so now wrap all that into a function and loop that over your lines and job done!?