I have two dataframes
df1 <- setNames(data.frame(c(as.POSIXct("2022-07-29 00:00:00","2022-07-29 00:05:00","2022-07-29 00:10:00","2022-07-29 00:15:00","2022-07-29 00:20:00")), c(1,2,3,4,5)), c("timeStamp", "value"))
df2 <- setNames(data.frame(c(as.POSIXct("2022-07-29 00:00:05","2022-07-29 00:05:05","2022-07-29 00:20:05")), c("a","b","c")), c("timeStamp", "text"))
I want to merge them into a single data frame by timeStamp, where the times are matched in chronological order, and the mismatching columns are simply filled in with NA. So it would read
timestamp value text
"2022-07-29 00:00:00" 1 NA
"2022-07-29 00:00:05" NA a
...
I can't for my life figure out why simply merge(x,y,by=) doesn't work for this. neither does left_join. It seems so simple and I can't seem to figure it out.
Also, a bonus would be to make it work this way as well, in which each time there is a change in text, all the rows below it remain the last text until the text changes again prompted by df2:
timestamp value text
"2022-07-29 00:00:00" 1 NA
"2022-07-29 00:00:05" NA a
"2022-07-29 00:05:00" 2 a
"2022-07-29 00:05:05" NA b
"2022-07-29 00:10:00" 3 b
"2022-07-29 00:15:00" 4 b
"2022-07-29 00:20:00" 5 b
"2022-07-29 00:20:05" NA c
Thanks
You need to do a full join.
tidyverse solution:
df1 %>%
full_join(df2, c("timeStamp" = "timeStamp")) %>%
arrange(timeStamp) %>%
tidyr::fill(text)
timeStamp value text
1 2022-07-29 00:00:00 1 <NA>
2 2022-07-29 00:00:05 NA a
3 2022-07-29 00:05:00 2 a
4 2022-07-29 00:05:05 NA b
5 2022-07-29 00:10:00 3 b
6 2022-07-29 00:15:00 4 b
7 2022-07-29 00:20:00 5 b
8 2022-07-29 00:20:05 NA c
Base R merge:
merge(df1, df2, by = c("timeStamp" = "timeStamp"), all = T)