Search code examples
rrasterterra

Convert bounds into raster in R


I have a data frame with quad keys and associated values for activity index, and I want to convert it into a raster. The quadkeyr package is too slow for the scale of my data (each hour across all of Canada), so I just want to use the terra package.

Here is a sample of the data:

structure(list(geography = c("021302200001103223", "021302200001123321", 
"021302200001111311", "021302200001110301", "021302200001100333", 
"021302200001111211"), xlon = c(-112.35512, -112.34962, -112.32491, 
-112.33864, -112.35786, -112.3304), xlat = c(50.72298, 50.70994, 
50.73254, 50.73254, 50.72994, 50.73254), bounds = c("-112.35580,50.72255,-112.35443,50.72342", 
"-112.35031,50.70950,-112.34894,50.71037", "-112.32559,50.73211,-112.32422,50.73298", 
"-112.33932,50.73211,-112.33795,50.73298", "-112.35855,50.72950,-112.35718,50.73037", 
"-112.33109,50.73211,-112.32971,50.73298"), start_date = c("2021-07-01", 
"2021-07-01", "2021-07-01", "2021-07-01", "2021-07-01", "2021-07-01"
), end_date = c("2021-07-31", "2021-07-31", "2021-07-31", "2021-07-31", 
"2021-07-31", "2021-07-31"), agg_day_period = c(0L, 0L, 1L, 1L, 
1L, 1L), agg_time_period = c(15L, 17L, 10L, 11L, 12L, 15L), activity_index_total = c(0.010026, 
0.034617, 0.02384, 0.051245, 0.023967, 0.027478)), row.names = c(NA, 
6L), class = "data.frame")

How can I convert this into a raster without using a dedicated quadkay package? I don't have data on the original resolution but I have bounds in the data frame itself.


Solution

  • You can create a polygon dataset using the coordinates in the bounds column and then rasterize() the polygons. Some data wrangling is required to create the polygons, but the rest is relatively straightforward.

    Note that based on your example data, the raster representation of each associated polygon may differ, but this issue is inherent to shape-to-raster transformations. In other words, if your polygons do not align exactly with the SpatRaster grid, terra will return an extent for each polygon that is as close as possible. I have added a plot below comparing 10 metre and 1 metre resolution to illustrate. There are other rasterize() parameters such as touches = that can help improve results.

    This reprex assumes your coordinates are WGS84 / EPSG:4326. Given the mean area of the polygons in your data is 9340.2 m^2, a 10 metre resolution seems reasonable e.g. res = 10/111320. If you need greater accuracy, 1 metre resolution for example, use res = 1/111320.

    library(sf)
    library(terra)
    library(dplyr)
    library(tidyr)
    
    # Example df
    df <- structure(list(geography = c("021302200001103223", "021302200001123321", 
    "021302200001111311", "021302200001110301", "021302200001100333", 
    "021302200001111211"), xlon = c(-112.35512, -112.34962, -112.32491, 
    -112.33864, -112.35786, -112.3304), xlat = c(50.72298, 50.70994, 
    50.73254, 50.73254, 50.72994, 50.73254), bounds = c("-112.35580,50.72255,-112.35443,50.72342", 
    "-112.35031,50.70950,-112.34894,50.71037", "-112.32559,50.73211,-112.32422,50.73298", 
    "-112.33932,50.73211,-112.33795,50.73298", "-112.35855,50.72950,-112.35718,50.73037", 
    "-112.33109,50.73211,-112.32971,50.73298"), start_date = c("2021-07-01", 
    "2021-07-01", "2021-07-01", "2021-07-01", "2021-07-01", "2021-07-01"
    ), end_date = c("2021-07-31", "2021-07-31", "2021-07-31", "2021-07-31", 
    "2021-07-31", "2021-07-31"), agg_day_period = c(0L, 0L, 1L, 1L, 
    1L, 1L), agg_time_period = c(15L, 17L, 10L, 11L, 12L, 15L), activity_index_total = c(0.010026, 
    0.034617, 0.02384, 0.051245, 0.023967, 0.027478)), row.names = c(NA, 
    6L), class = "data.frame")
    
    # Create polygon sf using coordinates from bounds column
    sf_bounds <- df |>
      separate_wider_delim(bounds, ",", names = c("lon1", "lat1", "lon2", "lat2")) |>
      mutate(across(starts_with("l"), as.numeric)) |>
      rowwise() %>%
      mutate(lon3 = min(lon1, lon2),
             lat3 = max(lat1, lat2),
             lon4 = max(lon1, lon2),
             lat4 = min(lat1, lat2)) |>
      pivot_longer(cols = starts_with("lon") | starts_with("lat"),
                   names_to = c(".value", "tmp"),
                   names_pattern = "(lon|lat)(\\d+)") |>
      st_as_sf(coords = c("lon", "lat"), crs = 4326) |>
      group_by(geography) |>
      mutate(id = c(1, 3, 2, 4))|>
      arrange(geography, id) |>
      summarise(geography = first(geography),
                activity_index_total = first(activity_index_total),
                geometry = st_combine(geometry)) |>
      st_cast("POLYGON")
    
    # Create SpatRaster template with approximately 10 metre resolution
    r <- rast(vect(sf_bounds), res = 10/111320, crs = sf_bounds)
    
    # Create SpatRaster of your data, assign values e.g. "activity_index_total"
    r1 <- rasterize(vect(sf_bounds), r, "activity_index_total")
    
    r1
    # class       : SpatRaster 
    # dimensions  : 261, 382, 1  (nrow, ncol, nlyr)
    # resolution  : 8.983112e-05, 8.983112e-05  (x, y)
    # extent      : -112.3585, -112.3242, 50.7095, 50.73295  (xmin, xmax, ymin, ymax)
    # coord. ref. : lon/lat WGS 84 (EPSG:4326) 
    # source(s)   : memory
    # name        : activity_index_total 
    # min value   :             0.010026 
    # max value   :             0.051245  
    

    Plot comparing cell extent of a single bounds polygon at 10m and 1m resolution 1