In my DATA
below, I want to create a New_Column
which pastes the names of columns A:F
in which there is a Y
to achieve my Desired_Output
.
I was wondering if this might be possible in R?
DATA <- read.table(h=TRUE,t="
ID A B C D E F
1 Y Y N N N N
2 N N N Y N N
3 Y N Y N Y N
4 Y Y N N N N
5 N N N Y N N
6 Y Y Y N N N
7 N N N N N N
8 NA NA NA NA NA NA")
Desired_Output <- read.table(h=TRUE,t="
ID A B C D E F New_Column
1 Y Y N N N N A_B
2 N N N Y N N D
3 Y N Y N Y N A_C_E
4 Y Y N N N N A_B
5 N N N Y N N D
6 Y Y Y N N N A_B_C
7 N N N N N N N
8 NA NA NA NA NA NA NA")
You can do this in a couple of steps with the tidyverse:
library(tidyverse)
new_col <- DATA |>
pivot_longer(-ID) |>
filter(value == 'Y') |>
group_by(ID) |>
summarize(new_column = paste(name, collapse = '_'))
new_df <- left_join(DATA, new_col, by = 'ID')
ID A B C D E F new_column
1 1 Y Y N N N N A_B
2 2 N N N Y N N D
3 3 Y N Y N Y N A_C_E
4 4 Y Y N N N N A_B
5 5 N N N Y N N D
6 6 Y Y Y N N N A_B_C
7 7 N N N N N N <NA>
8 8 <NA> <NA> <NA> <NA> <NA> <NA> <NA>