I would like to ask the R community for help with my data: any run of consecutive rows that each contain several NA values should be combined and put into a new column.
For example:
df <- data.frame(A = c(1, 2, 3, 4, 5, 6), B = c(2, NA, NA, 5, NA, NA), C = c(1, 2, NA, 4, 5, NA), D = c(3, NA, 5, NA, NA, NA))
  A  B  C  D
1 1  2  1  3
2 2 NA  2 NA
3 3 NA NA  5
4 4  5  4 NA
5 5 NA  5 NA
6 6 NA NA NA
Must be transformed to this:
  A B C  D                    E
1 1 2 1  3  2 NA 2 NA 3 NA NA 5
2 4 5 4 NA 5 NA 5 NA 6 NA NA NA
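I would like to do the following: keep each row that has at most one NA as it is, and append every following row that has more than one NA, as a single string, to a new column E on that row.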
This is quite complex (for me!) and I am wondering if anyone can offer any help with this. I have searched for some similar problems, but have been unable to find one that produces a similar desired output.
Thank you very much for your thoughts--
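Using tidyr and dplyr:
Use unite() to build the candidate string for column E, then blank it for rows that have at most one NA.
Use cumsum() to start a new group at each of those rows.
Use summarize() to concatenate “bad” row values to a single string.

library(dplyr)
library(tidyr)

df %>%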
  unite("E", everything(), remove = FALSE, sep = " ") %>%
  mutate(
    E = if_else(
      rowSums(across(!E, is.na)) > 1,
      E,
      ""
    ),
    new_row = cumsum(E == "")
  ) %>%
  group_by(new_row) %>%
  summarize(
    across(A:D, first),
    E = trimws(paste(E, collapse = " "))
  ) %>%
  select(!new_row)
# A tibble: 2 × 5
      A     B     C     D E
  <dbl> <dbl> <dbl> <dbl> <chr>
1     1     2     1     3 2 NA 2 NA 3 NA NA 5
2     4     5     4    NA 5 NA 5 NA 6 NA NA NA
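
For comparison, here is a rough base R sketch of the same idea, in case adding packages is not an option. It is not the pipeline above, only an approximation of it: the names bad, grp, row_text and result are made up for illustration, and it assumes that missing values are real NA (not the string "NA") and that the first row is a "good" row with at most one NA.

```r
# Sample data with real missing values (NA, not the string "NA")
df <- data.frame(
  A = c(1, 2, 3, 4, 5, 6),
  B = c(2, NA, NA, 5, NA, NA),
  C = c(1, 2, NA, 4, 5, NA),
  D = c(3, NA, 5, NA, NA, NA)
)

# A row counts as "bad" when it has more than one NA
bad <- rowSums(is.na(df)) > 1

# Every good row starts a new group; bad rows stay in the group before them
# (assumption: the first row is good, so groups are numbered from 1)
grp <- cumsum(!bad)

# Space-separated text of each row; paste() writes NA as "NA"
row_text <- do.call(paste, c(df, sep = " "))

# Within each group, concatenate the text of the bad rows only
E <- tapply(
  ifelse(bad, row_text, ""),
  grp,
  function(x) trimws(paste(x, collapse = " "))
)

# Keep the good rows and attach the combined strings
result <- df[!bad, ]
result$E <- as.vector(E)

result$E
# "2 NA 2 NA 3 NA NA 5"  "5 NA 5 NA 6 NA NA NA"
```

The grouping key does most of the work here: cumsum(!bad) plays the same role as new_row = cumsum(E == "") in the dplyr version, so each good row and the bad rows that follow it share one group number.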