In order to do some time series analysis, I want to use a dataframe that looks like this:
data <- data.frame(
  Store_ID = as.character(c(seq(1, length.out = 10),
                            seq(1, length.out = 9),
                            c(1, 2, 3, 4, 6, 7, 8, 9))),
  amount_sold = c(seq(1, 9, length.out = 27)),
  date = c(rep(as.Date("2015-01-01"), 10),
           rep(as.Date("2015-01-02"), 9),
           rep(as.Date("2015-01-03"), 8))
)
As you can see, there are 10 Store_IDs for the first date (2015-01-01), but only 9 for the next date and 8 for the last date.
For my analysis I need to add the Store_IDs that are missing on the second and third days. The result should have 30 rows (10 stores x 3 dates), with amount_sold set to 0 for the rows that were missing.
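Try complete() from tidyr:
library(tidyr)  # tidyr re-exports %>%, so no separate magrittr/dplyr load is needed

data <- data.frame(
  Store_ID = as.character(c(seq(1, length.out = 10),
                            seq(1, length.out = 9),
                            c(1, 2, 3, 4, 6, 7, 8, 9))),
  amount_sold = c(seq(1, 9, length.out = 27)),
  date = c(rep(as.Date("2015-01-01"), 10),
           rep(as.Date("2015-01-02"), 9),
           rep(as.Date("2015-01-03"), 8))
) %>%
  # Expand to every Store_ID/date combination (10 x 3 = 30 rows)
  # and fill amount_sold with 0 for the combinations that were missing.
  complete(Store_ID, date, fill = list(amount_sold = 0))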
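If you would rather avoid the tidyr dependency, the same idea can be sketched in base R: build every Store_ID/date combination with expand.grid(), left-join the observed rows onto it with merge(), and replace the resulting NAs with 0. This is only a rough equivalent, not what complete() does internally, and the names grid and full are just placeholders chosen for this example.

```r
# Rebuild the example data from the question.
data <- data.frame(
  Store_ID = as.character(c(1:10, 1:9, c(1, 2, 3, 4, 6, 7, 8, 9))),
  amount_sold = seq(1, 9, length.out = 27),
  date = c(rep(as.Date("2015-01-01"), 10),
           rep(as.Date("2015-01-02"), 9),
           rep(as.Date("2015-01-03"), 8))
)

# Every Store_ID paired with every date (10 x 3 = 30 combinations).
# "grid" is a placeholder name for this sketch.
grid <- expand.grid(
  Store_ID = unique(data$Store_ID),
  date = unique(data$date),
  stringsAsFactors = FALSE
)

# Left join the observed rows onto the full grid;
# combinations with no observation get NA in amount_sold.
full <- merge(grid, data, by = c("Store_ID", "date"), all.x = TRUE)

# Replace the NAs introduced by the join with 0.
full$amount_sold[is.na(full$amount_sold)] <- 0
```

Note that merge() sorts the result by the join columns, and because Store_ID is a character column the order is "1", "10", "2", ... rather than numeric.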