I have this data set with values for twins within families, and I want to collapse it to one row per family, keeping the non-missing value in each column:
zyg    fid x_t1 x_t2 y_t1 y_t2
  1 499474   NA    1    1   NA
  1 499474   NA   NA   NA   NA
  1 499474   NA   NA   NA    1
  1 499474   NA   NA   NA   NA
  1 499540   NA   NA    1   NA
  1 499540   NA   NA   NA   NA
  2 499874   NA   NA   NA   NA
  2 499874   NA   NA    1   NA
  2 499874   NA   NA   NA    1
  2 499874    2   NA   NA    1
The expected output for family 499474 is:
zyg    fid x_t1 x_t2 y_t1 y_t2
  1 499474   NA    1    1    1
and for family 499874, it should be:
  2 499874    2   NA    1    1
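You can group by fid and, within each family, take the first non-missing value of every column:
library(dplyr)

df %>%
  group_by(fid) %>%
  # na.omit() drops the NAs; first() keeps the first remaining value,
  # or NA when a column is entirely missing for that family
  summarise_all(~first(na.omit(.)))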
Output:
# A tibble: 3 × 6
     fid   zyg  x_t1  x_t2  y_t1  y_t2
   <int> <int> <int> <int> <int> <int>
1 499474     1    NA     1     1     1
2 499540     1    NA    NA     1    NA
3 499874     2     2    NA     1     1
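Since summarise_all() is superseded as of dplyr 1.0.0, the same idea could also be written with across(). The sketch below is only an illustration: the small data frame `toy` and its values are made up for the example, and plain indexing (`.x[!is.na(.x)][1]`) is swapped in for first(na.omit(.)), relying on the fact that indexing an empty vector with [1] returns NA.

```r
library(dplyr)

# Hypothetical toy data (not the data from the question):
# two families, values scattered across rows
toy <- data.frame(
  fid  = c(1L, 1L, 2L, 2L),
  x_t1 = c(NA, 5L, NA, NA),
  y_t1 = c(3L, NA, NA, 7L)
)

res <- toy %>%
  group_by(fid) %>%
  # For every non-grouping column: drop the NAs, then take element 1.
  # If nothing is left, [1] on the empty vector yields NA.
  summarise(across(everything(), ~ .x[!is.na(.x)][1]))

res
```

Note that if a family has two different non-missing values in the same column, either version silently keeps the first one in row order, so it may be worth checking for such conflicts before collapsing.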
Your data:
df<-structure(list(zyg = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
), fid = c(499474L, 499474L, 499474L, 499474L, 499540L, 499540L,
499874L, 499874L, 499874L, 499874L), x_t1 = c(NA, NA, NA, NA,
NA, NA, NA, NA, NA, 2L), x_t2 = c(1L, NA, NA, NA, NA, NA, NA,
NA, NA, NA), y_t1 = c(1L, NA, NA, NA, 1L, NA, NA, 1L, NA, NA),
y_t2 = c(NA, NA, 1L, NA, NA, NA, NA, NA, 1L, 1L)), class = "data.frame", row.names = c(NA,
-10L))