Filling in missing values by group in R


I have a sample dataset below:

df <- data.frame(id = c(1,1,2,2,3,3),
                 gender = c("male",NA,"female","female",NA, "female"))

> df
  id gender
1  1   male
2  1   <NA>
3  2 female
4  2 female
5  3   <NA>
6  3 female

When the rows are grouped by id, some groups have a missing gender value. I would like to fill those missing cells using the value that is present elsewhere in the same group.

So the desired output would be:

> df
  id gender
1  1   male
2  1   male
3  2 female
4  2 female
5  3 female
6  3 female

Any thoughts? Thanks!


Solution

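  • You can use dplyr::group_by() and tidyr::fill(). With .direction = "updown", each missing value is filled from the row above it within the group if there is one, and otherwise from the row below, e.g.: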

    df |>
      dplyr::group_by(id) |>
      tidyr::fill(gender, .direction = "updown")
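For completeness, here is a self-contained sketch of the same approach, run on the sample data from the question. The name `filled` and the trailing `dplyr::ungroup()` are my additions, not part of the answer above; ungrouping is optional, but it stops later operations from silently running per id. It also assumes every id has at most one distinct non-missing gender; if a group held conflicting values, `fill()` would just copy whichever neighbour comes first in that direction.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
df <- data.frame(
  id     = c(1, 1, 2, 2, 3, 3),
  gender = c("male", NA, "female", "female", NA, "female")
)

filled <- df |>
  group_by(id) |>                          # fill within each id only
  fill(gender, .direction = "updown") |>   # fill down first, then up
  ungroup()                                # drop the grouping afterwards

# `filled` is a hypothetical name used here for illustration
print(filled)
```

Row 2 is filled from the row above it (down), while row 5 has no earlier value in its group, so it is filled from the row below (up). A group whose values are all NA stays NA.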