I have a dataset with multiple dates and conditions. I would like to extract all the rows which start with condition place == "A" and all rows with start date of place == "A" and up to 7 days later. For example:
Date Place Value1 Value2
2018-10-27 C 20 8
2018-10-29 A 10 5
2018-10-31 B 15 6
2018-11-4 C 17 9
2018-11-8 D 18 5
And I want:
Date Place Value1 Value2
2018-10-29 A 10 5
2018-10-31 B 15 6
2018-11-4 C 17 9
As you can see it must extract the first row with place == A and all rows within 7 days later. The places after the first day like "A" doesn't make sense, like "B" and "C". It must start with "A". It skips 2018-11-8 because that is more than 7 days from 2018-10-29.
I tried it like this question: R: Extract data based on date, "if date lesser than" , but I don't know how to extract the 7 days.
We can use match
to get corresponding Date
value and select all the rows within 7 days from it.
library(dplyr)
df %>%
mutate(Date = as.Date(Date)) %>%
filter({tmp <- Date[match('A', Place)]
between(Date, tmp, tmp + 7)})
# Date Place Value Value.1
#1 2018-10-29 A 10 5
#2 2018-10-31 B 15 6
#3 2018-11-04 C 17 9
dplyr
allows to perform operations without creating temporary variables in global environment, the above solution can be written in base R as :
df$Date <- as.Date(df$Date)
date_val <- df$Date[match('A', df$Place)]
subset(df, Date >= date_val & Date <= date_val + 7)
data
df <- structure(list(Date = structure(c(17831, 17833, 17835, 17839,
17843), class = "Date"), Place = c("C", "A", "B", "C", "D"),
Value = c(20L, 10L, 15L, 17L, 18L), Value.1 = c(8L, 5L, 6L,
9L, 5L)), row.names = c(NA, -5L), class = "data.frame")