Search code examples
rggplot2spatialgeom-sf

speeding it up: plotting an area using the extent of multiple sf objects together


I have three sf objects: one is multilinestring (creeks) another multipolygon (properties) and a third polygon (coastline). These are all big files, a creek might cross several properties, several creeks might be distributed around a large area, and all are within the coastline, which is large.

creeks: Simple feature collection with 1927331 features and 25 fields Geometry type: MULTILINESTRING

properties: Simple feature collection with 2 features and 25 fields Geometry type: MULTIPOLYGON

coastline: Simple feature collection with 8567 features and 1 field Geometry type: POLYGON

I can plot like this:

ggplot() +
  geom_sf(data = creeks) +
  geom_sf(data = properties)

but I clearly have no coastline, which I would like.

I can plot them like this:

ggplot() +
  geom_sf(data = creeks) +
  geom_sf(data = properties) +
  geom_sf(data = coastline, fill = NA)

and they plot to the scale of the coastline, which is too large.

I can plot them like this:

ggplot() +
  geom_sf(data = creeks) +
  geom_sf(data = properties) +
  geom_sf(data = st_crop(coastline, st_union(creeks, properties)), fill = NA)

and I get a map of appropriate size, but it takes several minutes to plot because, I suppose, of the union operation. Also, I find the boundary is too close to the creek and property data (no buffer).

I also tried this:

pl_area <- st_buffer(st_union(creeks, properties), 10000)
ggplot() +
  geom_sf(data = creeks) +
  geom_sf(data = properties) +
  geom_sf(data = st_crop(coastline, pl_area), fill = NA)

and this works, but it takes even more minutes!

I wonder if someone has a solution where I can plot my map at the right size/resolution and doesn't take forever? My first though was I might be able to set an extent using a union of bounding boxes or something.. but if so, I don't know how. Definietly not like this:

st_buffer(
  st_union(
    st_bbox(creeks), st_bbox(properties)),
  10000)

The spatial data I have don't have the same column names etc, so rbind doesn't work. I am open to the idea that I could align their column names and then rbind, but I was hoping for neater.

Does anyone have a ideas?


Solution

  • To union bboxes, you should convert those to geometries first. But you can also construct your own bbox by combining appropriate minimum and maximum values (it's simplest form is just a named vector of 4 values). Example bellow uses shapefile from sf package and takes 2 subsets to simulate a somewhat similar case.

    library(sf)
    #> Linking to GEOS 3.11.2, GDAL 3.6.2, PROJ 9.2.0; sf_use_s2() is TRUE
    library(ggplot2)
    library(patchwork)
    
    nc = read_sf(system.file("shape/nc.shp", package="sf"))
    subset_1 <- dplyr::filter(nc, dplyr::between(CNTY_ID, 1800, 1850 ))
    subset_2 <- dplyr::filter(nc, dplyr::between(CNTY_ID, 1900, 1950 ))
    
    (bb_1 <- st_bbox(subset_1))
    #>      xmin      ymin      xmax      ymax 
    #> -81.74107  35.99472 -75.77316  36.58965
    (bb_2 <- st_bbox(subset_2))
    #>      xmin      ymin      xmax      ymax 
    #> -82.96275  35.50681 -76.69376  36.25709
    
    # convert bboxes to sfc, then union
    bb_union <- st_union(st_as_sfc(bb_1),st_as_sfc(bb_2))
    st_as_text(bb_union)
    #> [1] "POLYGON ((-82.96275 35.50681, -76.69376 35.50681, -76.69376 36.01401, -75.77316 35.99472, -75.77316 36.58965, -81.74107 36.58965, -81.74107 36.28277, -82.96275 36.25709, -82.96275 35.50681))"
    
    # or combine minimums of *xmin* & *ymin* pairs and maximums of *xmax* & *ymax* pairs:
    (bb_merged <- c(pmin(bb_1[1:2], bb_2[1:2]), 
                    pmax(bb_1[3:4], bb_2[3:4])))
    #>      xmin      ymin      xmax      ymax 
    #> -82.96275  35.50681 -75.77316  36.58965
    
    
    p1 <- ggplot() +
      geom_sf(data = nc) +
      geom_sf(data = subset_1, fill = "darkred") +
      geom_sf(data = subset_2, fill = "darkgreen") +
      geom_sf(data = bb_union, linewidth = 1, fill = "gold", alpha = .2) +
      theme_minimal()
    
    p2 <- ggplot() +
      geom_sf(data = st_crop(nc, bb_merged)) +
      geom_sf(data = subset_1, fill = "darkred") +
      geom_sf(data = subset_2, fill = "darkgreen") +
      ggtitle("st_crop(nc, bb_merged))") +
      theme_minimal()
    #> Warning: attribute variables are assumed to be spatially constant throughout
    #> all geometries
    
    p3 <- ggplot() +
      geom_sf(data = st_crop(nc, bb_union)) +
      geom_sf(data = subset_1, fill = "darkred") +
      geom_sf(data = subset_2, fill = "darkgreen") +
      ggtitle("st_crop(nc, bb_union)") +
      theme_minimal()
    #> Warning: attribute variables are assumed to be spatially constant throughout
    #> all geometries
      
    p1/p2/p3
    

    You could also consider simplification to speed up plotting (sf::st_simplify() , rmapshaper::ms_simplify() ). And you may need to adjust when working with different projections / dataums, either directly or indirectly -- note how bbox defined by 4 corners can cut into geometries in this example.