Search code examples
rdataframezootibblemodelr

Expand one column in a tibble and have the descriptive columns follow


I want to use zoo::na.approx (but not married to this function) to fill in a response variable for the missing days in my dataframe. I'm having a tough time figuring out how to add the NAs to the original dataframe so that na.approx can fill them in.

My dataframe looks something like this:

 df<-data.frame(trt=c("A", "A", "A", "A", "B", "B", "B", "B"),
                day = c(1,3,7,9,1,5,8,9),
                value = c(7,12,5,7,5,6,11,8),
                stringsAsFactors=FALSE)

I want every day to be in the dataframe with "NA" for each day where I don't have data.

I have used something like this to expand my dataset:

library(dplyr)

days_possible <- expand.grid(
  day = seq(from=min(df$day), max(df$day), by=1),
  trt = c("A", "B"), 
  stringsAsFactors = FALSE
   )

new_df<- df %>%
   right_join(days_possible, by = c("trt", "day"))

My problem is that I have a bunch of sites, years, and a few treatment columns, so somewhere it seems to all get messed up and in my days_possible dataframe, I don't get it right.

Is there a function to avoid this mess, expand one column, and have all the other columns expand in a tidy fashion? I'm looking at modelr::data_grid, but by itself I am not sure how to get the final desired result - an ordered dataframe I can group by treatments and use approximation to fill in missing days.


Solution

  • We can use the complete and full_seq functions from the tidyr package. The final as.data.frame() is not required. I just added it to print the output as a data frame.

    library(tidyr)
    
    df2 <- df %>% 
      complete(trt, day = full_seq(day, period = 1)) %>%
      as.data.frame()
    
    df2
    #    trt day value
    # 1    A   1     7
    # 2    A   2    NA
    # 3    A   3    12
    # 4    A   4    NA
    # 5    A   5    NA
    # 6    A   6    NA
    # 7    A   7     5
    # 8    A   8    NA
    # 9    A   9     7
    # 10   B   1     5
    # 11   B   2    NA
    # 12   B   3    NA
    # 13   B   4    NA
    # 14   B   5     6
    # 15   B   6    NA
    # 16   B   7    NA
    # 17   B   8    11
    # 18   B   9     8