Search code examples
rdata-transformdata-handling

Insert rows corresponding to each month in an yearly data


I have a dataframe with yearly data from 2014 to 2018. I wish to expand this dataframe into monthly values and basically divide each variable's value by 12 as each month's value. Please note: There is no month column in my dataframe as of now. So, if there are 5 products, I have 5*5 rows and 5 columns :"year", "Product_ID", "Var1", "Var2" and "Var3" as the columns.

Eventually, I want 5*12 rows and 6 column with "month" inserted as well.

I have tried this code but it isnt working:

df_new$date <- NA
df_new <- complete(df,Product_ID, date = full_seq(2014,1))

Any suggestions?


Solution

  • One option is to use uncount to repeat rows 12 times, create a new column month to take 1:12 value for each year and then divide Var columns by 12.

    library(dplyr)
    library(tidyr)
    
    df %>%
      uncount(12) %>%
      group_by(year) %>%
      mutate(month = 1:12) %>%
      mutate_at(vars(Var1, Var2), ~./12)
    
    # Groups:   year [3]
    #    year Product_ID  Var1  Var2 month
    #   <int> <chr>      <dbl> <dbl> <int>
    # 1  2013 A          0.833     5     1
    # 2  2013 A          0.833     5     2
    # 3  2013 A          0.833     5     3
    # 4  2013 A          0.833     5     4
    # 5  2013 A          0.833     5     5
    # 6  2013 A          0.833     5     6
    # 7  2013 A          0.833     5     7
    # 8  2013 A          0.833     5     8
    # 9  2013 A          0.833     5     9
    #10  2013 A          0.833     5    10
    # … with 26 more rows
    

    Or another option with complete and fill

    df %>%
      mutate(month = 1) %>%
      complete(year, month = 1:12) %>%
      fill(Product_ID, Var1, Var2) %>%
      mutate_at(vars(Var1, Var2), ~./12)
    

    data

    df <- data.frame(year = 2013:2015, Product_ID = c("A", "B", "C"), 
          Var1 = c(10, 20, 30), Var2 = c(60, 80, 120), stringsAsFactors = FALSE)