Let's say I have a tibble like this:
df <- tribble(
~date, ~place, ~wthr,
#------------/-----/--------
"2017-05-06","NY","sun",
"2017-05-06","CA","cloud",
"2017-05-07","NY","sun",
"2017-05-07","CA","rain",
"2017-05-08","NY","cloud",
"2017-05-08","CA","rain",
"2017-05-09","NY","cloud",
"2017-05-09","CA",NA,
"2017-05-10","NY","cloud",
"2017-05-10","CA","rain"
)
I want to check whether the weather in a specific place on a specific day was the same as on the previous day, and attach the result to df as a logical column, so that it looks like this:
tribble(
~date, ~place, ~wthr, ~same,
#------------/-----/------/------
"2017-05-06","NY","sun", NA,
"2017-05-06","CA","cloud", NA,
"2017-05-07","NY","sun", TRUE,
"2017-05-07","CA","rain", FALSE,
"2017-05-08","NY","cloud", FALSE,
"2017-05-08","CA","rain", TRUE,
"2017-05-09","NY","cloud", TRUE,
"2017-05-09","CA", NA, NA,
"2017-05-10","NY","cloud", TRUE,
"2017-05-10","CA","rain", NA
)
Is there a good way to do this?
To get a logical column, compare each wthr value with the value in the previous row using lag, after grouping by place. I added arrange on date to make sure the rows are in chronological order.
library(dplyr)
df %>%
  arrange(date) %>%
  group_by(place) %>%
  mutate(same = wthr == lag(wthr, default = NA))
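To see why the grouping matters, here is a small sketch comparing lag with and without group_by. The toy tibble and the names toy, ungrouped, grouped, and prev are made up for illustration and are not part of the question's data.

```r
library(dplyr)

# Toy data (hypothetical, not from the question): two places, interleaved rows
toy <- tibble(
  place = c("NY", "CA", "NY", "CA"),
  wthr  = c("sun", "cloud", "sun", "rain")
)

# Without grouping, lag() takes the previous row regardless of place,
# so a NY row would be compared against a CA row
ungrouped <- toy %>%
  mutate(prev = lag(wthr))

# With grouping, lag() takes the previous row within the same place;
# the first row of each place has no predecessor, so prev and same are NA
grouped <- toy %>%
  group_by(place) %>%
  mutate(prev = lag(wthr), same = wthr == prev) %>%
  ungroup()

print(ungrouped)
print(grouped)
```

In the grouped version, the first row of each place gets NA because there is nothing to compare it with, which matches the NA values requested for 2017-05-06.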
Edit: If you want to make sure the dates are consecutive (1 day apart), you can wrap the comparison in ifelse to check that the difference between date and lag(date) is 1. If two rows are not 1 day apart, the result is coded as NA.
Note: Also make sure your date column is of class Date:
df$date <- as.Date(df$date)
df %>%
  arrange(date) %>%
  group_by(place) %>%
  mutate(same = ifelse(
    date - lag(date) == 1,
    wthr == lag(wthr, default = NA),
    NA))
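As a rough illustration of the consecutive-day check, the sketch below uses a made-up tibble with one missing day. The names gap, res, and days_since_prev are hypothetical, and converting the difference with as.numeric(..., units = "days") is just one way to make the unit explicit.

```r
library(dplyr)

# Toy data (hypothetical): 2017-05-08 is missing, so the last row is 2 days
# after the row before it
gap <- tibble(
  date  = as.Date(c("2017-05-06", "2017-05-07", "2017-05-09")),
  place = "NY",
  wthr  = c("sun", "sun", "sun")
)

res <- gap %>%
  arrange(date) %>%
  group_by(place) %>%
  mutate(
    # Number of days between this row and the previous row of the same place
    days_since_prev = as.numeric(date - lag(date), units = "days"),
    # Compare weather only when the previous row is exactly 1 day earlier;
    # otherwise (gap or no previous row) the result is NA
    same = ifelse(days_since_prev == 1, wthr == lag(wthr), NA)
  ) %>%
  ungroup()

print(res)
```

Here the last row gets NA even though the weather is unchanged, because the previous observation is two days old rather than from the day before.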
Output
   date       place wthr  same
   <chr>      <chr> <chr> <lgl>
 1 2017-05-06 NY    sun   NA
 2 2017-05-06 CA    cloud NA
 3 2017-05-07 NY    sun   TRUE
 4 2017-05-07 CA    rain  FALSE
 5 2017-05-08 NY    cloud FALSE
 6 2017-05-08 CA    rain  TRUE
 7 2017-05-09 NY    cloud TRUE
 8 2017-05-09 CA    NA    NA
 9 2017-05-10 NY    cloud TRUE
10 2017-05-10 CA    rain  NA