r, dplyr, fill

Filling in missing values for all months in R


I'm having trouble with something that must be quite easy in R.

Let's say we have data like this.

df <- read.table(text="id,date,value
1,202105,10
1,202106,5
1,202107,7
1,202108,8
1,202109,6
1,202110,1
1,202111,9
2,202110,10
2,202111,2
2,202112,4
2,202201,7",sep=",",header=TRUE)

id     date      value
1      202105    10
1      202106    5
1      202107    7   
1      202108    8 
1      202109    6 
1      202110    1 
1      202111    9 
2      202110    10
2      202111    2
2      202112    4
2      202201    7  

For each id, I would like to add a row for every date that appears anywhere in the data, with NA as the value where that id has no observation.

id     date      value
1      202105    10
1      202106    5
1      202107    7   
1      202108    8 
1      202109    6 
1      202110    1 
1      202111    9 
1      202112    NA    
1      202201    NA
2      202105    NA
2      202106    NA
2      202107    NA
2      202108    NA
2      202109    NA
2      202110    10
2      202111    2
2      202112    4
2      202201    7  

Thank you very much in advance!


Solution

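  • Use tidyr::complete, which expands the data to every combination of id and date found in it and fills the missing rows with NA: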

    tidyr::complete(df, id, date)
    
    # A tibble: 18 × 3
          id   date value
       <int>  <int> <int>
     1     1 202105    10
     2     1 202106     5
     3     1 202107     7
     4     1 202108     8
     5     1 202109     6
     6     1 202110     1
     7     1 202111     9
     8     1 202112    NA
     9     1 202201    NA
    10     2 202105    NA
    11     2 202106    NA
    12     2 202107    NA
    13     2 202108    NA
    14     2 202109    NA
    15     2 202110    10
    16     2 202111     2
    17     2 202112     4
    18     2 202201     7
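
One caveat worth sketching: complete(df, id, date) only knows about dates that occur for at least one id. That is enough for the data in the question, but a month missing from every id would stay missing. The example below is a minimal sketch of one way around that, assuming date is always an integer in yyyymm form. The names df_gap, to_date and all_months are made up for illustration and are not part of the original question or answer; df_gap is the question's data with 202112 deliberately removed to create such a gap.

```r
library(tidyr)

df <- read.table(text = "id,date,value
1,202105,10
1,202106,5
1,202107,7
1,202108,8
1,202109,6
1,202110,1
1,202111,9
2,202110,10
2,202111,2
2,202112,4
2,202201,7", sep = ",", header = TRUE)

# Hypothetical variant of the data: drop 202112 so that no id contains it.
df_gap <- subset(df, date != 202112)

# Assumed helper: turn a yyyymm integer into a Date on the first of the month,
# so that seq() can step through calendar months.
to_date <- function(yyyymm) as.Date(paste0(yyyymm, "01"), format = "%Y%m%d")

# Every month between the earliest and latest date, back in yyyymm integer form.
all_months <- as.integer(format(
  seq(to_date(min(df_gap$date)), to_date(max(df_gap$date)), by = "month"),
  "%Y%m"
))

# Using only the observed dates leaves 202112 out of the result entirely.
observed_only <- complete(df_gap, id, date)

# Supplying the full sequence adds 202112 for both ids, with value NA.
filled <- complete(df_gap, id, date = all_months)
```

Here observed_only has 16 rows (2 ids × 8 observed months), while filled has 18 rows (2 ids × 9 calendar months). For the original df the two calls give the same 18 rows, so the extra step only matters when a whole month is absent.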