I have a simple question for data in the format below (date
is dd/m/yy). Code for data input is below the question.
#> date a b
#> 1 25/1/20 10 20
#> 2 26/1/20 20 40
How can I add rows for all of January, with 0 values for a
and b
?
My desired output is:
#> date a b
#> 1 01/1/20 0 0
#> 2 02/1/20 0 0
#> ...
#> ...
#> 25/1/20 10 20
#> 26/1/20 20 40
Tidyverse answers are preferred, but I'm happy with any solution.
Code for data input:
df <- data.frame(
stringsAsFactors = FALSE,
date = c("25/1/20", "26/1/20"),
a = c(10L, 20L),
b = c(20L, 40L)
)
You could use a new dataframe and then join. The key is to create a new dataframe based on the max date on your df
and define a sequence since the first day. Here the code:
library(dplyr)
library(lubridate)
#Data
df <- data.frame(
stringsAsFactors = FALSE,
date = c("25/1/20", "26/1/20"),
a = c(10L, 20L),
b = c(20L, 40L)
)
#Code 1
df %>% mutate(date=dmy(date)) -> df
#Identify max date
maxdate <- max(df$date)
#Set initial day
startday <- as.Date(paste0(format(maxdate,'%Y-%m'),'-01'))
#New data frame
ndf <- data.frame(date=seq(startday,maxdate,by=1))
#Join
ndf %>% left_join(df) %>% replace(is.na(.),0)
Output:
date a b
1 2020-01-01 0 0
2 2020-01-02 0 0
3 2020-01-03 0 0
4 2020-01-04 0 0
5 2020-01-05 0 0
6 2020-01-06 0 0
7 2020-01-07 0 0
8 2020-01-08 0 0
9 2020-01-09 0 0
10 2020-01-10 0 0
11 2020-01-11 0 0
12 2020-01-12 0 0
13 2020-01-13 0 0
14 2020-01-14 0 0
15 2020-01-15 0 0
16 2020-01-16 0 0
17 2020-01-17 0 0
18 2020-01-18 0 0
19 2020-01-19 0 0
20 2020-01-20 0 0
21 2020-01-21 0 0
22 2020-01-22 0 0
23 2020-01-23 0 0
24 2020-01-24 0 0
25 2020-01-25 10 20
26 2020-01-26 20 40