Given time-series frequency data like dat1, which contains an event_id and the number of times each event occurs,
what is the most elegant way in R to convert it to sequential wide-format data like dat2?
dat1
id    event_no  event_id  times
P001  1         A         3
P001  2         B         1
P001  3         C         2
P001  4         D         5
P002  1         A         5
P002  2         B         3
P002  3         C         1
P002  4         D         1
P002  5         E         1
dat2
id    t1  t2  t3  t4  t5  t6  t7  t8  t9  t10  t11
P001  A   A   A   B   C   C   D   D   D   D    D
P002  A   A   A   A   A   B   B   B   C   D    E
Thanks
Using dplyr and tidyr, we can first repeat rows using uncount, then create a unique column name for each row (t1, t2, ...) after grouping by id, and use pivot_wider to convert the data into wide format.
library(dplyr)
library(tidyr)
df %>%
  uncount(times) %>%                                   # repeat each row `times` times
  group_by(id) %>%
  mutate(event_no = paste0("t", row_number())) %>%     # t1, t2, ... within each id
  pivot_wider(names_from = event_no, values_from = event_id)
# Use spread() in older versions of tidyr:
# spread(event_no, event_id)
#   id    t1    t2    t3    t4    t5    t6    t7    t8    t9    t10   t11
#   <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct>
# 1 P001  A     A     A     B     C     C     D     D     D     D     D
# 2 P002  A     A     A     A     A     B     B     B     C     D     E
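For comparison, the same idea (repeat rows, group by id, spread into columns) can be sketched in base R without extra packages. This is only an illustrative alternative, not part of the dplyr/tidyr approach above; the object names long, seqs, wide and dat2 are chosen here for the example, and the data is rebuilt with character columns rather than factors to keep the block self-contained.

```r
# Same values as dat1 in the question (character columns assumed, not factors)
dat1 <- data.frame(
  id       = rep(c("P001", "P002"), times = c(4, 5)),
  event_no = c(1:4, 1:5),
  event_id = c("A", "B", "C", "D", "A", "B", "C", "D", "E"),
  times    = c(3, 1, 2, 5, 5, 3, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Step 1: repeat each row `times` times (the role uncount() plays above)
long <- dat1[rep(seq_len(nrow(dat1)), dat1$times), c("id", "event_id")]

# Step 2: collect the repeated events into one sequence per id
seqs <- split(long$event_id, long$id)

# Step 3: fill a character matrix row by row;
# ids with shorter sequences keep NA in the trailing columns
n <- max(lengths(seqs))
wide <- matrix(NA_character_, nrow = length(seqs), ncol = n,
               dimnames = list(names(seqs), paste0("t", seq_len(n))))
for (i in seq_along(seqs)) {
  wide[i, seq_along(seqs[[i]])] <- seqs[[i]]
}

# Turn the matrix into a data frame with id as a regular column
dat2 <- data.frame(id = rownames(wide), wide,
                   row.names = NULL, stringsAsFactors = FALSE)
dat2
```

Filling a pre-sized matrix avoids the recycling that rbind would do if two ids had sequences of different total length; pivot_wider handles that case by filling with NA as well.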
data
df <- structure(list(id = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L), .Label = c("P001", "P002"), class = "factor"), event_no = c(1L,
2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L), event_id = structure(c(1L, 2L,
3L, 4L, 1L, 2L, 3L, 4L, 5L), .Label = c("A", "B", "C", "D", "E"
), class = "factor"), times = c(3L, 1L, 2L, 5L, 5L, 3L, 1L, 1L,
1L)), class = "data.frame", row.names = c(NA, -9L))