Using R, I would like to count the number of points within a polygon according to a specific criterion (a temporal window).
I have the following data:
Using QGIS, I created a set of circular 50 mile buffers around each survey respondent. My goal is to count the number of political rallies that falls within each respondent's "buffer" within a specific time frame preceding the interview. The 50 mile buffers created in QGIS retain all variables of the original data, including the date of the interview.
Using QGIS, I created some mock shapefiles with dates and locations to aid in replication.
I am trying to use GISTools::poly.counts
to count the number of rallies within different temporal windows (30 days, 90 days, etc.).
Generally, to count the number of points within a polygon, I would simply use:
count <- GISTools::poly.counts(rallies, buffer)
This gives me the total number of rallies that occur within each buffer, but doesn't allow me to specify temporal windows. For example, it would be great to develop a count of the number of rallies within a buffer for the 30 days preceding the survey interview as well as the 90 days preceding the interview.
Remember, each polygon within my buffer
shapefile has a different date of interview.
Here's what I've tried, but it's not working:
buffer$count_30 <- GISTools::poly.counts(
rallies[buffer$date - rallies$date > 0 & buffer$date - rallies$date <= 30],
buffer)
I get the following error:
Error in `[.data.frame`(x@data, i, j, ..., drop = FALSE) :
undefined columns selected
In addition: Warning messages:
1: In unclass(time1) - unclass(time2) :
longer object length is not a multiple of shorter object length
2: In unclass(time1) - unclass(time2) :
longer object length is not a multiple of shorter object length
What is the correct way to accomplish this?
Another answer using sf
, but this time using spatial joins and dplyr for filtering etc.
library(tidyverse)
library(sf)
rallies <- read_sf('Downloads/stack_ex_q/rallies.shp')
# Here I don't use the supplied buffer, but make one according to the data
#fifty_buff <- read_sf('Downloads/stack_ex_q/rallies.shp')
surveys <- read_sf('Downloads/stack_ex_q/surveys.shp')
# Transform to a crs using meters as a distance & make date col a proper date
rallies <- st_transform(rallies, crs = 2163) %>%
mutate(date = as.Date(date))
surveys <- st_transform(surveys, crs = 2163) %>%
mutate(date = as.Date(date))
# make a buffer w/ 50 mile radius (80467 meters), not used but useful for visualization
buffer_50mi <- st_buffer(surveys, dist = 80467)
Plot the data for a quick visual check:
library(mapview)
mapview(rallies, col.regions = 'purple') +
mapview(surveys, col.regions = 'black') +
mapview(buffer_50mi, col.regions = 'green')
Join the data using st_is_within_distance, using 80467m = 50 miles.
joined <- surveys %>%
st_join(rallies, join = st_is_within_distance, 80467)
head(joined)
Simple feature collection with 6 features and 4 fields
geometry type: POINT
dimension: XY
bbox: xmin: 1350401 ymin: -556609 xmax: 1438586 ymax: -455743.1
projected CRS: NAD27 / US National Atlas Equal Area
# A tibble: 6 x 5
id.x date.x geometry id.y date.y
<dbl> <date> <POINT [m]> <dbl> <date>
1 1 2020-04-26 (1350401 -556609) 16 2020-02-19
2 1 2020-04-26 (1350401 -556609) 17 2020-05-12
3 2 2020-03-27 (1438586 -455743.1) 7 2020-02-18
4 2 2020-03-27 (1438586 -455743.1) 15 2020-07-01
5 2 2020-03-27 (1438586 -455743.1) 15 2020-03-28
6 3 2020-06-12 (1352585 -479940.5) 15 2020-07-01
The .x columns are from the survey sf object & the .y columns are from the rallies sf object. Geometry is retained from the survey sf.
Using dplyr's filter, group_by, and mutate, find what you're looking for. The code below counts rallies within 50 miles and +/- 60 days by survey point as an example.
joined_60days <- joined %>%
group_by(id.x) %>%
mutate(date_diff = as.numeric(date.x - date.y)) %>%
filter(!is.na(date_diff)) %>% ## remove survey points with no rallies in 50mi/60d
filter(abs(date_diff) <= 60) %>%
group_by(id.x) %>%
count()
head(joined_60days)
Simple feature collection with 4 features and 2 fields
geometry type: POINT
dimension: XY
bbox: xmin: 1268816 ymin: -556609 xmax: 1438586 ymax: -322572.4
projected CRS: NAD27 / US National Atlas Equal Area
# A tibble: 4 x 3
id.x n geometry
<dbl> <int> <POINT [m]>
1 1 1 (1350401 -556609)
2 2 2 (1438586 -455743.1)
3 3 1 (1352585 -479940.5)
4 4 2 (1268816 -322572.4)
Quick visual check: