Search code examples
rggplot2openstreetmaposmdata

Why do I get neighboring states when mapping in R with osmdata?


I'm attemping to generate a map of the state of Louisiana and the parishes (i.e., counties) within it using osmdata and ggplot in R. This is where I'm at currently:

library(osmdata)
library(ggplot2)

louisiana <- opq(getbb("Louisiana"))
parishes <- osmdata_sf(add_osm_feature(louisiana, key = "admin_level", value = "6"))

mapla <- ggplot() +
  geom_sf(data = parishes$osm_multipolygons) +
  theme_void()

Which returns the following:

enter image description here

This correctly includes Louisiana with all of its parishes but also includes counties in the neighboring states of Mississippi, Arkansas, and Texas. What do I need to do to just get Louisiana?

Relatedly... What I first tried was the following:

louisiana <- opq(getbb("Louisiana"))
state <- osmdata_sf(add_osm_feature(louisiana, key = "admin_level", value = "4"))

mapla <- ggplot() +
  geom_sf(data = state$osm_multipolygons) +
  theme_void()

And what I got in return was this, which includes the neighboring states in their entirety:

enter image description here

So it seems that something about the way I'm approaching this is causing my to get back whatever I'm looking for with the addition of anything that's neighboring on the same administrative level.


Solution

  • @Chris is rigth, it's all about bounding box of the Louisiana state. Overpass (which is used by {osmdata} to get the data) returns all features which intersects the bounding box. The features are complete. Therefore you have surrounding states as well. The easiest way is to intersects (sf::st_filter()) the results with Louisiana boundary (admin_level = 4 and name == Louisiana).

    l <- osmdata::getbb("Louisiana")
    
    b <- osmdata::opq(bbox = l, timeout = 60*20) |>
      osmdata::add_osm_feature(key = "boundary", value = "administrative") |>
      osmdata::add_osm_feature(key = "admin_level", value = c("4", "6")) |>
      osmdata::osmdata_sf() |>
      osmdata::unname_osmdata_sf()
    
    l_state <- b$osm_multipolygons |>
      subset(admin_level == 4 & name == "Louisiana")
    
    parishes <- b$osm_multipolygons |>
      sf::st_filter(l_state, .predicate=sf::st_within) |>
      subset(admin_level == 6)
    
    tmap::tm_shape(parishes) +
      tmap::tm_borders() +
      tmap::tm_shape(l_state) +
      tmap::tm_borders(col = "red")
    

    Edit:

    Let me extend the answer and try to explain the way it works:

    In first step we are asking for bounding box of Louisiana:

    l <- osmdata::getbb("Louisiana")
    

    The bounding box is returned as a matrix

    l
    #>         min       max
    #> x -94.04319 -88.75833
    #> y  28.85429  33.01959
    

    This bounding box later is used by Overpass as a spatial filter for our query.

    Let me convert it to polygon for maping/drawing purposes and draw it over Louisiana:

    whole_bbox <- l |>
      sfext::as_bbox(crs = "EPSG:4326") |>
      sfext::sf_bbox_to_sf()
    
    tmap::tm_shape(whole_bbox) +
      tmap::tm_borders(col = "blue", lwd = 3) +
      tmap::tm_shape(l_state) +
      tmap::tm_borders(col = "red") +
      tmap::tm_basemap()
    

    Now, lets query for parishes within our bounding box and draw them on the map:

    all_parishes <- osmdata::opq(bbox = l, timeout = 60*20) |>
      osmdata::add_osm_feature(key = "boundary", value = "administrative") |>
      osmdata::add_osm_feature(key = "admin_level", value = c("6")) |>
      osmdata::osmdata_sf() |>
      osmdata::unname_osmdata_sf()
    
    tmap::tm_shape(all_parishes$osm_multipolygons) +
      tmap::tm_borders(col = "green") +
      tmap::tm_shape(whole_bbox) +
      tmap::tm_borders(col = "blue", lwd = 3) +
      tmap::tm_shape(l_state) +
      tmap::tm_borders(col = "red") +
      tmap::tm_basemap()
    

    As you can see, Overpass returned all parishes which are touched by bounding box (in fact, all which spatially intersects with bounding box) including those outside Louisiana. Therefore we have somehow filter out only those, which are interesting. As there is no common tag which can be used for filtering, we are using spatial filter (sf::st_filter()) to get only those which are within Louisiana state (st_within).

    Hope it makes sense :).

    Created on 2024-03-20 with reprex v2.1.0