I have a dataframe that looks like this:
DF<-data.frame(ID=c(seq(1,10, by = 1)),
C1=c(9.1,0,9,9.2,9.1,0,0,9.3,9.1,9.0) ,
grp = c(1,2,2,3,4,5,5,5,6,7)
) %>%
group_by(grp) %>%
mutate (
C1m = mean(C1),
)
The C1m values in groups 2 and 5 are unreliable because at least one C1 value is equal to 0. I am trying to get the variable C2 which for group 2 and 5 will be from the preceding group (1 and 4 respectively).
This is the output:
ID C1 grp C1m C2
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 9.1 1 9.1 9.1
2 2 0 2 4.5 9.1
3 3 9 2 4.5 9.1
4 4 9.2 3 9.2 9.2
5 5 9.1 4 9.1 9.1
6 6 0 5 3.1 9.1
7 7 0 5 3.1 9.1
8 8 9.3 5 3.1 9.1
9 9 9.1 6 9.1 9.1
10 10 9 7 9 9
Does anyone know an efficient way to solve this?
Thank you!
Please try the below code
DF<-data.frame(ID=c(seq(1,10, by = 1)),
C1=c(9.1,0,9,9.2,9.1,0,0,9.3,9.1,9.0) ,
grp = c(1,2,2,3,4,5,5,5,6,7)
) %>%
mutate(C2x=ifelse(C1==0,NA_real_,C1)) %>% fill(C2x) %>% group_by(grp) %>%
mutate (
C1m = round(mean(C1),digits = 1),
C2 = round(mean(C2x),digits = 1),
) %>% select(-C2x)
Created on 2023-07-12 with reprex v2.0.2
# A tibble: 10 × 5
# Groups: grp [7]
ID C1 grp C1m C2
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 9.1 1 9.1 9.1
2 2 0 2 4.5 9.1
3 3 9 2 4.5 9.1
4 4 9.2 3 9.2 9.2
5 5 9.1 4 9.1 9.1
6 6 0 5 3.1 9.2
7 7 0 5 3.1 9.2
8 8 9.3 5 3.1 9.2
9 9 9.1 6 9.1 9.1
10 10 9 7 9 9