Search code examples
rggplot2geom-sfgratia

Adding a geom_sf to a long-lat plot with gratia package in R


Using the draw funtction of gratia with a model that contains a smooth s(longitude, latitude) will plot a long-lat plot and effect contours. That's very nice!

I want to add a country shape to the plot

library(giscoR)

vatican <- gisco_get_countries(resolution = "10", country = "VAT") %>%
  mutate(res = "10M")

Plotting the shape with ggplot works

ggplot() +
  geom_sf(data=vatican)

but with gratia not so much.

draw(model,
     select="s(longitude,latitude)") +
  geom_sf(data=vatican)

I get the error message

Coordinate system already present. Adding new coordinate system, which will replace the existing one.
Error in `geom_sf()`:
! Problem while computing aesthetics.
ℹ Error occurred in the 5th layer.
Caused by error in `.data[["longitude"]]`:
! Column `longitude` not found in `.data`.

I'd apreciate any help how to solve this!

EDIT

Model to test is

sound <- sample(0:5, size=960, replace=T)
word <- as.factor(rep(c('alpha', 'bravo', 'charlie', 'delta', 'echo', 'foxtrot'), each=4))
age <- as.factor(rep(c('young', 'old'), times=480))
gender <- as.factor(rep(c('female', 'female', 'male', 'male'), times=240))
longitude <- rep(c(runif(40, min=41, max=42)), each=24)
latitude <- rep(c(runif(40, min=12, max=13)), each=24)
pronunciation <- data.frame(sound, word, age, gender, longitude, latitude)

library(mgcv)

model = gam(list(sound ~ word + s(longitude, latitude),
                       ~ word + s(longitude, latitude),
                       ~ word + s(longitude, latitude),
                       ~ word + s(longitude, latitude),
                       ~ word + s(longitude, latitude)),
                       data=pronunciation,
                       family=multinom(K=5))

Solution

  • The developer of gratia states that the draw() function:

    was never intended to allow the sorts of customization that is possible with ggplot() or some other packages that use ggplot() as the plotting layer.

    Therefore, I doubt there is a way to plot an sf object directly using draw(). However, below are two possible solutions. Note that if you intended your example data to overlap with your vatican object, you assigned the longitude and latitude coordinates incorrectly. This is corrected in the example data used in this reprex.

    Option 1: Convert sf object to a dataframe

    • Pros: Less complex of the two options
    • Cons: Does not account for relative x/y ratio of CRS coordinates
    library(mgcv)
    library(dplyr)
    library(sf)
    library(gratia)
    library(giscoR)
    library(ggplot2)
    
    set.seed(1) # For reproducibility
    sound <- sample(0:5, size = 960, replace = TRUE)
    word <- as.factor(rep(c("alpha", "bravo", "charlie", "delta", "echo", "foxtrot"), each = 4))
    age <- as.factor(rep(c("young", "old"), times = 480))
    gender <- as.factor(rep(c("female", "female", "male", "male"), times = 240))
    longitude <- rep(c(runif(40, min = 12, max = 13)), each = 24)
    latitude <- rep(c(runif(40, min = 41, max = 42)), each = 24)
    pronunciation <- data.frame(sound, word, age, gender, longitude, latitude)
    
    model <- gam(list(sound ~ word + s(longitude, latitude),
                      ~ word + s(longitude, latitude),
                      ~ word + s(longitude, latitude),
                      ~ word + s(longitude, latitude),
                      ~ word + s(longitude, latitude)),
                 data = pronunciation,
                 family = multinom(K = 5))
    
    vatican <- gisco_get_countries(resolution = "10", country = "VAT") |>
      mutate(res = "10M")
    
    # Convert vatican sf object to df and add longitide and latitude columns (X,Y)
    vatican_df <- vatican |>
      select(geometry) |>
      st_cast("POLYGON") |>
      st_coordinates() |>
      as.data.frame()
    
    # Plot
    draw(model, select = "s(longitude,latitude)") + 
      geom_polygon(data = vatican_df,
                   aes(X, Y),
                   fill = "yellow",
                   colour = "yellow",
                   linewidth = 2) +  # linewidth exaggerated for illustrative purposes
      labs(title = "Title")
    

    1

    Option 2: Extract data from ggplot() object created by draw() and plot using ggplot()

    • Pros: easier to control individual geoms if familiar with ggplot(), possible to use geom_sf() and therefore maintains x/y coordinate ratio of sf CRS
    • Cons: more complex, may need trial and error to extract correct layers
    # Create plot object
    p <- draw(model, select = "s(longitude,latitude)")
    
    # Extract plot data from ggplot2 object
    p_df <- ggplot_build(p) 
    p_df1 <- p_df$plot$data # Estimates / partial effect
    p_df2 <- p_df$data[[2]] # Line object layer
    p_df3 <- p_df$data[[3]] # Panel with colour values
    p_df4 <- p_df$data[[4]] # Point object layer
    
    # Plot
    ggplot() +
      geom_tile(data = p_df1,
                aes(longitude, latitude, fill = .estimate)) +
      geom_line(data = p_df2,
                 aes(x, y, group = piece),
                 size = 0.3) +
      geom_point(data = p_df4,
                 aes(x, y),
                 size = 2) +
      geom_sf(data = vatican,
              fill = "yellow",
              colour = "yellow",
              linewidth = 2) +
      labs(title = "Title", caption = paste("Basis:", p_df1[1,".type"])) +
      scale_fill_gradient2(low = p_df3[1,1], midpoint = 0, mid = "white", high = p_df3[2,1],
                           name = "Partial\neffect")
    

    2

    The colour ramp is not identical to option 1, but you can set your own custom or bult-in gradient if it needs to match.