group = c(1,1,4,4,4,5,5,6,1,4,6)
animal = c('a','b','c','c','d','a','b','c','b','d','c')
sleep = c(14,NA,22,15,NA,96,100,NA,50,2,1)
test = data.frame(group, animal, sleep)
print(test)
group_animal = test %>% group_by(`group`, `animal`) %>% summarise(mean_sleep = mean(sleep, na.rm = T))
I would like to replace the NA values the sleep column based on the mean sleep value grouped by group and animal.
Is there any way that I can perform some sort of lookup like Excel that matches group and animal from the test dataframe to the group_animal dataframe and replaces the NA value in the sleep column from the test df with the sleep value in the group_animal df?
We could use mutate
instead of summarise
as summarise
returns a single row per group
library(dplyr)
library(tidyr)
test <- test %>%
group_by(group, animal) %>%
mutate(sleep = replace_na(sleep, mean(sleep, na.rm = TRUE))) %>%
ungroup
-output
test
# A tibble: 11 × 3
group animal sleep
<dbl> <chr> <dbl>
1 1 a 14
2 1 b 50
3 4 c 22
4 4 c 15
5 4 d 2
6 5 a 96
7 5 b 100
8 6 c 1
9 1 b 50
10 4 d 2
11 6 c 1