I'd like to generate a fully paneled (monthly) time series
I tried tsibble
which works great for large data though for small sets with lots of missing data it seems to choose very wide intervals.
Also, for easy comparison of many different sets, I'd like to specify a start and end date.
library(dplyr)
data <- structure(list(
month = structure(c(18078, 18201), class = "Date"),
account = c("3125", "3100"),
sum = c(-21.0084, -2000)),
class = c("tbl_df", "tbl", "data.frame"),
row.names = c(NA, -2L))
data %>%
mutate(month = tsibble::yearmonth(month)) %>%
tsibble::as_tsibble(key = account, index = month) %>%
tsibble::fill_gaps(sum = 0, .full = T)
Here I have a minimal example which results in
# A tibble: 4 x 3
month account sum
<mth> <chr> <dbl>
1 2019 Jul 3100 0
2 2019 Nov 3100 -2000
3 2019 Jul 3125 -21.0
4 2019 Nov 3125 0
but should start in May to December with 0 for each missing month per group (account).
library(dplyr, warn.conflicts = FALSE)
library(tsibble, warn.conflicts = FALSE)
data <- structure(list(
month = structure(c(18078, 18201), class = "Date"),
account = c("3125", "3100"),
sum = c(-21.0084, -2000)),
class = c("tbl_df", "tbl", "data.frame"),
row.names = c(NA, -2L))
data %>%
mutate(month = yearmonth(month)) %>%
as_tsibble(key = account, index = month) %>%
full_join(
tibble(
month = seq(as.Date("2019-05-01"), as.Date("2019-12-01"), by = "1 month")
)
) %>%
fill_gaps(sum = 0, .full = TRUE) %>%
filter(account != is.na(account)) %>%
print(n = 20)
#> Joining, by = "month"
#> # A tsibble: 16 x 3 [1M]
#> # Key: account [2]
#> month account sum
#> <mth> <chr> <dbl>
#> 1 2019 May 3100 0
#> 2 2019 Jun 3100 0
#> 3 2019 Jul 3100 0
#> 4 2019 Aug 3100 0
#> 5 2019 Sep 3100 0
#> 6 2019 Oct 3100 0
#> 7 2019 Nov 3100 -2000
#> 8 2019 Dec 3100 0
#> 9 2019 May 3125 0
#> 10 2019 Jun 3125 0
#> 11 2019 Jul 3125 -21.0
#> 12 2019 Aug 3125 0
#> 13 2019 Sep 3125 0
#> 14 2019 Oct 3125 0
#> 15 2019 Nov 3125 0
#> 16 2019 Dec 3125 0
Created on 2020-01-15 by the reprex package (v0.3.0)