I have a data.frame with a column like this:
Column_1
AAA
B
BBB
AAA_FACE
CCC
BBB_AAA
I want to spread this column into new columns, but only for the values containing a specific pattern: "AAA". Spreading every unique value would give me far too many columns.
After spreading the values, I want to make them binary, so ideally my new data.frame looks like this:
AAA AAA_FACE BBB_AAA
1 0 0
0 0 0
0 0 0
0 1 0
0 0 0
0 0 1
I tried tidyr's spread() function, but it spreads the data into one column per unique value instead of only the columns containing the 'AAA' pattern.
One option with tidyverse would be:
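library(tidyverse)

# Example data from the question
df1 <- data.frame(Column_1 = c("AAA", "B", "BBB", "AAA_FACE", "CCC", "BBB_AAA"),
                  stringsAsFactors = FALSE)

df1 %>%
  mutate(i1 = as.integer(str_detect(Column_1, "AAA")),  # 1 if the value contains "AAA", else 0
         rn = row_number()) %>%                          # row id keeps the rows distinct
  spread(Column_1, i1, fill = 0) %>%                     # one column per unique value
  select(matches("AAA"))                                 # keep only the "AAA" columns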
# AAA AAA_FACE BBB_AAA
#1 1 0 0
#2 0 0 0
#3 0 0 0
#4 0 1 0
#5 0 0 0
#6 0 0 1
It can be made a bit more efficient by replacing the other values with NA before doing the spread, so the non-matching values no longer each get a column of their own:
df1 %>%
  mutate(i1 = as.integer(str_detect(Column_1, "AAA")),
         Column_1 = replace(Column_1, !i1, NA),  # !i1 is TRUE where "AAA" was not found
         rn = row_number()) %>%
  spread(Column_1, i1, fill = 0) %>%
  select(matches("AAA"))                         # also drops the column created for NA
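For comparison, here is a minimal base R sketch of the same idea that skips the spread step entirely. It is an alternative technique, not the tidyverse approach above: it builds one indicator column per matching value directly. The names `keep` and `out` are made up for illustration, and `df1` is assumed to hold the example data from the question.

```r
# Example data from the question (assumed layout)
df1 <- data.frame(
  Column_1 = c("AAA", "B", "BBB", "AAA_FACE", "CCC", "BBB_AAA"),
  stringsAsFactors = FALSE
)

# Unique values that contain the literal pattern "AAA"
keep <- unique(grep("AAA", df1$Column_1, value = TRUE, fixed = TRUE))

# One integer 0/1 column per kept value: 1 where the row equals that value
out <- as.data.frame(
  vapply(keep,
         function(v) as.integer(df1$Column_1 == v),
         integer(nrow(df1)))
)

print(out)
```

Because only the matching values are ever turned into columns, the result has the same width as the desired output no matter how many other unique values the column contains.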