Search code examples
rcopygroup

R: Copying value for one type of group and year downwards and upwards


I created a simplified version of my data since my actual data is too large to be include here.

structure(list(Name = c("A", "A", "A", "A", "B", "B", "B", "B", 
"C", "C", "C", "C"), Category = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L), Year = c(2018L, 2019L, 2020L, 2021L, 2018L, 
2019L, 2020L, 2021L, 2018L, 2019L, 2020L, 2021L), Cost = c(NA, 
100L, NA, NA, 99L, 45L, 34L, NA, 64L, 34L, NA, 34L)), class = "data.frame", row.names = c(NA, 
-12L))

The focus here is on the column "Cost" . If there is NA, in a first step, I use the most recent year with available data in the respective category.

Data %>%
  group_by(`Category`, Year) %>%
  dplyr:: mutate(Cost = na.locf(Cost))

The problem now is that for "Name" A, there is no value in the "Year" 2018. That's why I get an error. How do I need to adjust, the code, that then in a second step, if there is no data available from prior year for this respective "Category", I use the data from the next available year. In this specific case it would be "Year" 2019 and the value 100.

Thank you for your help in advance.


Solution

  • Use tidyr::fill with .direction = "downup":

    library(tidyr)
    library(dplyr)
    
    Data %>% 
      group_by(Category) %>% 
      fill(Cost, .direction = "downup")
    
    # A tibble: 12 × 4
    # Groups:   Category [2]
       Name  Category  Year  Cost
       <chr>    <int> <int> <int>
     1 A            1  2018   100
     2 A            1  2019   100
     3 A            1  2020   100
     4 A            1  2021   100
     5 B            2  2018    99
     6 B            2  2019    45
     7 B            2  2020    34
     8 B            2  2021    34
     9 C            2  2018    64
    10 C            2  2019    34
    11 C            2  2020    34
    12 C            2  2021    34