I have a tibble (my_tibble) with two columns, day and rained. day is the day number, and rained is 1 if it rained on that day and 0 if it did not. I want to calculate a third column, days_since, that gives the number of days (excluding the current day) since it last rained. After a lot of trial and error, I came up with a script that yields the desired output (see the reprex below). However, it's not very elegant, and I'm curious whether this could be achieved with the purrr package.
My brute-force approach computes a helper column, temp, that holds the day number of the most recent rain. I then subtract temp from the day number to get days_since, except when it rained on the current day; in that case I subtract lead(temp) from the day instead. (The data are sorted by descending day at that point, so lead() looks at earlier days.)
library(tidyverse)

my_tibble <- tibble(day = 1:10, rained = c(1, 0, 1, 0, 0, 1, 0, 0, 0, 1))

my_tibble <- my_tibble %>%
  # Work from the last day backwards
  arrange(desc(day)) %>%
  # Day number on rainy days, 0 otherwise
  mutate(rows_that_are_1 = day * rained) %>%
  # Blank out the latest day so it picks up the previous rain day
  mutate(temp = if_else(row_number() == 1, NA_real_, rows_that_are_1)) %>%
  fill(temp, .direction = "up") %>%
  # Turn dry days (0) into NA, then fill with the nearest earlier rain day
  mutate(temp = if_else(temp == 0, NA_real_, temp)) %>%
  fill(temp, .direction = "up") %>%
  # On rainy days day - temp is 0, so use the rain day before that one
  mutate(days_since = if_else(day - temp > 0, day - temp, day - lead(temp))) %>%
  # Day 1 has no earlier rain, so report 0
  replace_na(list(days_since = 0)) %>%
  select(day, rained, days_since)

print(my_tibble)
print(my_tibble)
#> # A tibble: 10 × 3
#> day rained days_since
#> <int> <dbl> <dbl>
#> 1 10 1 4
#> 2 9 0 3
#> 3 8 0 2
#> 4 7 0 1
#> 5 6 1 3
#> 6 5 0 2
#> 7 4 0 1
#> 8 3 1 2
#> 9 2 0 1
#> 10 1 1 0
Created on 2024-01-28 with reprex v2.0.2
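The helper-column idea above can also be written without the fill() steps. This is only a minimal sketch, assuming one row per day, positive day numbers, and that days with no earlier rain should get 0 (as day 1 does in the desired output). The column names last_rain and prev_rain are illustrative, not part of the original script.

```r
library(dplyr)

my_tibble <- tibble(day = 1:10, rained = c(1, 0, 1, 0, 0, 1, 0, 0, 0, 1))

result <- my_tibble %>%
  arrange(day) %>%
  mutate(
    # Day number of the most recent rain up to and including today (0 if none yet)
    last_rain = cummax(day * rained),
    # Shift down one row so that today's own rain is excluded
    prev_rain = lag(last_rain, default = 0),
    # Assumption: with no earlier rain, report 0 rather than NA
    days_since = if_else(prev_rain == 0, 0, day - prev_rain)
  ) %>%
  select(day, rained, days_since)

print(result)
```

Unlike the original script, this keeps the rows in ascending day order, so the rows come out in the opposite order from the reprex output above.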
The sbtscs() function in the stevemisc package will do this for you, except that you'll have to modify the first day's value. The function is meant for multiple cross-sectional units observed over time, so you need to create a cross-sectional identifier if you don't already have one. That's what obs is below.
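library(dplyr)
library(stevemisc)

my_tibble <- tibble(day = 1:10, rained = c(1, 0, 1, 0, 0, 1, 0, 0, 0, 1))

my_tibble %>%
  # Single cross-sectional unit, since there is only one series here
  mutate(obs = 1) %>%
  sbtscs(., event = rained, tvar = day, csunit = obs) %>%
  # Shift the spell counter by one and set day 1 to 0 to match the desired output
  mutate(spell = ifelse(day == 1, 0, spell + 1)) %>%
  select(-obs) %>%
  rename(days_since = spell)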
#> # A tibble: 10 × 3
#> day rained days_since
#> <int> <dbl> <dbl>
#> 1 1 1 0
#> 2 2 0 1
#> 3 3 1 2
#> 4 4 0 1
#> 5 5 0 2
#> 6 6 1 3
#> 7 7 0 1
#> 8 8 0 2
#> 9 9 0 3
#> 10 10 1 4
Created on 2024-01-28 with reprex v2.0.2
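Since the question asks about purrr specifically, a running counter built with purrr::accumulate() is one possible way to do it. This is a rough sketch, not a tested general solution: it assumes consecutive days with one row each, and that days with no earlier rain should get 0. The helper step_counter is a hypothetical name introduced here for illustration.

```r
library(dplyr)
library(purrr)

my_tibble <- tibble(day = 1:10, rained = c(1, 0, 1, 0, 0, 1, 0, 0, 0, 1))

# Hypothetical helper: derive today's counter from yesterday's counter
# and yesterday's rain flag
step_counter <- function(prev_count, rained_yesterday) {
  if (rained_yesterday == 1) {
    1                # it rained yesterday, so one day has passed
  } else if (prev_count == 0) {
    0                # assumption: no rain recorded yet, so stay at 0
  } else {
    prev_count + 1   # one more dry day since the last rain
  }
}

result <- my_tibble %>%
  arrange(day) %>%
  # .init = 0 supplies the first day's value; the remaining values are built
  # from the rain flags of days 1 to n - 1
  mutate(days_since = accumulate(head(rained, -1), step_counter, .init = 0))

print(result)
```

The counter can only be 0 before the first rain, which is why 0 doubles as the "no rain yet" marker here. If NA is preferred for those days, the helper and .init would need to change accordingly.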