I want to recode several year columns using a range specified by 'StartYear' and 'CloseYear'.
What's an elegant way to get from here:
library(tibble); library(dplyr)
(df <- tibble(id = c(1,2,3, 4),
`1997` = c(1,0,0, 1),
`1998` = c(0,1,0, 0),
`1999` = c(0,0,1, 0),
`2000` = c(0, 0, 1, 1),
StartYear = c(1998, 1997, 1998, 1998),
CloseYear = c(1999, 1997, 2000, 1999)))
#> # A tibble: 4 x 7
#> id `1997` `1998` `1999` `2000` StartYear CloseYear
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 0 0 0 1998 1999
#> 2 2 0 1 0 0 1997 1997
#> 3 3 0 0 1 1 1998 2000
#> 4 4 1 0 0 1 1998 1999
To here:
(tibble(id = c(1,2,3, 4),
`1997` = c(0, 1, 0, 0),
`1998` = c(1, 0, 1, 1),
`1999` = c(1, 0, 1, 1),
`2000` = c(0, 0, 1, 0),
StartYear = c(1998, 1997, 1998, 1998),
CloseYear = c(1999, 1997, 2000, 1999)))
#> # A tibble: 4 x 7
#> id `1997` `1998` `1999` `2000` StartYear CloseYear
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 0 1 1 0 1998 1999
#> 2 2 1 0 0 0 1997 1997
#> 3 3 0 1 1 1 1998 2000
#> 4 4 0 1 1 0 1998 1999
And is there a decent way using dplyr
/ the dplyr::mutate
functions?
One possible tidyverse approach. Gather, mutate, then spread...
library(tidyverse)
df %>%
gather(year, value, -id, -StartYear, -CloseYear, convert = TRUE) %>%
mutate(value = as.integer(StartYear <= year & year <= CloseYear)) %>%
spread(year, value)
#> # A tibble: 4 x 7
#> id StartYear CloseYear `1997` `1998` `1999` `2000`
#> <dbl> <dbl> <dbl> <int> <int> <int> <int>
#> 1 1 1998 1999 0 1 1 0
#> 2 2 1997 1997 1 0 0 0
#> 3 3 1998 2000 0 1 1 1
#> 4 4 1998 1999 0 1 1 0