I have a data frame with year, site and pointer.
year<-c(1200, 1201, 1202, 1204)
site_a<-c("0", "1", "1", "0")
site_b<-c("0", "0", "1", "0")
site_c<-c("0", "-1", "1", "0")
df<-data.frame(year,site_a, site_b, site_c)
lw2<-pivot_longer(df, cols=site_a:site_c, names_to="site", values_to="pointer")
I want to count the number of pointer (1 and -1) in two years spans like:
intval<-2
lw2%>%
group_by(site, pointer)%>%
filter(pointer == 1 | pointer == -1)%>%
group_by(pointer, add = TRUE)%>%
summarize(startDate = seq(from=1200, to=1204, by=2),
endDate = startDate + intval-1,
frequency = n())
As a result I am getting the same, total number of pointers for each time spans,
site | pointer | startDate | endDate | frequency |
---|---|---|---|---|
site_a | 1 | 1200 | 1201 | 2 |
site_a | 1 | 1202 | 1203 | 2 |
site_a | 1 | 1204 | 1205 | 2 |
while I want to get number of pointers for each time span:
site | pointer | startDate | endDate | frequency |
---|---|---|---|---|
site_a | 1 | 1200 | 1201 | 1 |
site_a | 1 | 1202 | 1203 | 1 |
site_a | 1 | 1204 | 1205 | 0 |
Do you have suggestions on how to get this?
Here's how I would do it.
library(dplyr)
library(tidyr)
lw2 %>%
right_join(expand(., year = 1200:1205, site)) %>%
group_by(year%/%2, site) %>%
summarise(startDate = min(year),
endDate = max(year),
frequency = sum(pointer != 0, na.rm = T),
.groups = 'drop') %>%
select(-1)
#> # A tibble: 9 × 4
#> site startDate endDate frequency
#> <chr> <dbl> <dbl> <int>
#> 1 site_a 1200 1201 1
#> 2 site_b 1200 1201 0
#> 3 site_c 1200 1201 1
#> 4 site_a 1202 1203 1
#> 5 site_b 1202 1203 1
#> 6 site_c 1202 1203 1
#> 7 site_a 1204 1205 0
#> 8 site_b 1204 1205 0
#> 9 site_c 1204 1205 0