Search code examples
rdplyrlag

Transfer of the average value from the previous group to the next


I have a dataframe that looks like this:

DF<-data.frame(ID=c(seq(1,10, by = 1)),
               C1=c(9.1,0,9,9.2,9.1,0,0,9.3,9.1,9.0) ,
               grp = c(1,2,2,3,4,5,5,5,6,7)
               ) %>% 
  group_by(grp) %>% 
  mutate (
    C1m = mean(C1),
  )

The C1m values in groups 2 and 5 are unreliable because at least one C1 value is equal to 0. I am trying to get the variable C2 which for group 2 and 5 will be from the preceding group (1 and 4 respectively).

This is the output:

      ID    C1   grp   C1m    C2
   <dbl> <dbl> <dbl> <dbl> <dbl>
 1     1   9.1     1   9.1   9.1
 2     2   0       2   4.5   9.1
 3     3   9       2   4.5   9.1
 4     4   9.2     3   9.2   9.2
 5     5   9.1     4   9.1   9.1
 6     6   0       5   3.1   9.1
 7     7   0       5   3.1   9.1
 8     8   9.3     5   3.1   9.1
 9     9   9.1     6   9.1   9.1
10    10   9       7   9     9 

Does anyone know an efficient way to solve this?

Thank you!


Solution

  • Please try the below code

    DF<-data.frame(ID=c(seq(1,10, by = 1)),
                   C1=c(9.1,0,9,9.2,9.1,0,0,9.3,9.1,9.0) ,
                   grp = c(1,2,2,3,4,5,5,5,6,7)
    ) %>% 
      mutate(C2x=ifelse(C1==0,NA_real_,C1)) %>% fill(C2x) %>% group_by(grp) %>% 
      mutate (
        C1m = round(mean(C1),digits = 1),
        C2 = round(mean(C2x),digits = 1),
      ) %>% select(-C2x)
    

    Created on 2023-07-12 with reprex v2.0.2

    # A tibble: 10 × 5
    # Groups:   grp [7]
          ID    C1   grp   C1m    C2
       <dbl> <dbl> <dbl> <dbl> <dbl>
     1     1   9.1     1   9.1   9.1
     2     2   0       2   4.5   9.1
     3     3   9       2   4.5   9.1
     4     4   9.2     3   9.2   9.2
     5     5   9.1     4   9.1   9.1
     6     6   0       5   3.1   9.2
     7     7   0       5   3.1   9.2
     8     8   9.3     5   3.1   9.2
     9     9   9.1     6   9.1   9.1
    10    10   9       7   9     9