I would like to know how to make a new column called "order" where the column is automatically filled up with sequences of integer (-3:3). The sequence is centered at "eventdate" and thus, on the "eventdate" the value of order is zero.
I have the following table.
df<-structure(list(Date = structure(c(15047L, 15048L, 15049L, 15050L,
15051L, 15052L, 15053L, 15054L, 15142L, 15143L, 15144L, 15145L,
15146L, 15147L, 15148L, 15119L, 15120L, 15121L, 15122L, 15123L,
15124L, 15125L, 15126L, 15497L, 15498L, 15499L, 15500L, 15501L,
15502L, 15503L, 15504L, 15505L, 15506L), class = c("IDate", "Date"
)), ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L), eventdate = structure(c(NA, NA, NA, NA, 15051L,
NA, NA, NA, NA, NA, NA, 15145L, NA, NA, NA, NA, NA, NA, 15122L,
NA, NA, NA, NA, NA, NA, NA, NA, 15501L, NA, 15503L, NA, NA, NA
), class = c("IDate", "Date")), Value = c(10L, 11L, 12L, 11L,
15L, 17L, 18L, 15L, 20L, 21L, 22L, 25L, 26L, 25L, 26L, 5L, 6L,
7L, 9L, 10L, 11L, 15L, 16L, 20L, 22L, 23L, 24L, 28L, 29L, 25L,
24L, 26L, 24L)), row.names = c(NA, -33L), class = c("data.table",
"data.frame"))
Again, I would like to make a new column called "order" so that on the day when eventdate is not NA, the order of it is zero. And then, I would like to fill up negative consecutively decreasing integers to the previous rows and positive consecutive increasing integers to the below rows per ID. Again, I only need the previous three rows to the event date and three rows after the event date, which makes seven rows per eventdate, since the seven rows include the eventdate itself. Please note that there could be multiple eventdates per ID.
Also, note that the two eventdates for the ID 3 occurs too frequently. There is only a day between the two eventdates. In this case, I would like to ignore the upcoming eventdate (2012-06-12) and create three consecutive and increasing order integers as if nothing happened on 2012-06-12. Similarly, on 2012-06-12, I would like to ignore eventdate(2012-06-10) and create three consecutive and decreasing order integers prior to 2012-06-12. It would make more sense to compare the current table and the desired output.
Below is the desired output.
output<-structure(list(Date = structure(c(15048L, 15049L, 15050L, 15051L,
15052L, 15053L, 15054L, 15142L, 15143L, 15144L, 15145L, 15146L,
15147L, 15148L, 15119L, 15120L, 15121L, 15122L, 15123L, 15124L,
15125L, 15498L, 15499L, 15500L, 15501L, 15502L, 15503L, 15504L,
15500L, 15501L, 15502L, 15503L, 15504L, 15505L, 15506L), class = c("IDate",
"Date")), ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), eventdate = structure(c(NA,
NA, NA, 15051L, NA, NA, NA, NA, NA, NA, 15145L, NA, NA, NA, NA,
NA, NA, 15122L, NA, NA, NA, NA, NA, NA, 15501L, NA, NA, NA, NA,
NA, NA, 15503L, NA, NA, NA), class = c("IDate", "Date")), Value = c(11L,
12L, 11L, 15L, 17L, 18L, 15L, 20L, 21L, 22L, 25L, 26L, 25L, 26L,
5L, 6L, 7L, 9L, 10L, 11L, 15L, 22L, 23L, 24L, 28L, 29L, 25L,
24L, 24L, 28L, 29L, 25L, 24L, 26L, 24L), order = c(-3L, -2L,
-1L, 0L, 1L, 2L, 3L, -3L, -2L, -1L, 0L, 1L, 2L, 3L, -3L, -2L,
-1L, 0L, 1L, 2L, 3L, -3L, -2L, -1L, 0L, 1L, 2L, 3L, -3L, -2L,
-1L, 0L, 1L, 2L, 3L)), row.names = c(NA, -35L), class = c("data.table",
"data.frame"))
Thank you very much in advance!
We may filter
the elements in 'eventdate' that have non-NA values after creating row sequence column and then use the sequence column to subset the corresponding column values from the original 'df' after expanding based on the sequence (-3:3
)
library(dplyr)
library(tidyr)
out2 <- df %>%
mutate(rn = row_number()) %>%
filter(complete.cases(eventdate)) %>%
rowwise %>%
mutate(order = list(-3:3), rn = list(rn + order)) %>%
ungroup %>%
unnest(where(is.list)) %>%
mutate(across(c("Date", "eventdate", "Value"),
~ df[[cur_column()]][rn])) %>%
select(-rn) %>%
mutate(eventdate = case_when(order == 0 ~ eventdate))
-checking with expected output
> all.equal(out2, output, check.attributes = FALSE)
[1] TRUE