r dataframe tidyverse transpose

Transforming Categorical Column into Binary Columns in R Based on Multiple Conditions


I have a data frame with two columns in R. One of the columns (column1) has three possible values: "A", "A and B", or "B". Each row is a patient.

I want to transpose column1 into binary columns (Yes/No), one for each of the two single values (A, B), showing whether the row contains that value.

For example, this data frame:

column1 | column2
A       | NA
A       | NA
A and B | NA
A and B | NA
B       | NA
B       | NA

Would give:

column1_A  | column1_B | column2
Yes        | No        |  NA
Yes        | No        |  NA
Yes        | Yes       |  NA
Yes        | Yes       |  NA
No         | Yes       |  NA
No         | Yes       |  NA

Reprex:

library(dplyr)
library(tidyr)

df <- data.frame(
  column1 = c("A", "A", "A and B", "A and B", "B", "B"),
  column2 = NA
)

Please keep in mind that this is a simplified version. I'd like the transposition to work regardless of how many distinct single values a given column contains.


Solution

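    vals <- c("A", "B")   # single values to look for
    var <- "column1"      # column to expand
    
    # One Yes/No column per value: "Yes" where the value occurs in the entry
    df[paste(var, vals, sep = "_")] <- sapply(vals,
                    \(x) ifelse(grepl(x, df[[var]]), "Yes", "No"))
    df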
    

      column1 column2 column1_A column1_B
    1       A      NA       Yes        No
    2       A      NA       Yes        No
    3 A and B      NA       Yes       Yes
    4 A and B      NA       Yes       Yes
    5       B      NA        No       Yes
    6       B      NA        No       Yes
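
One caveat: grepl() does regular-expression substring matching, so a value that happens to be part of another value (say a hypothetical "AB" next to "A") would also be flagged. A minimal sketch of an exact-match variant follows. It assumes the combined entries are always joined by the literal separator " and ", which is only inferred from the example data, and it needs R 4.1 or later for the \(x) lambda syntax:

```r
df <- data.frame(
  column1 = c("A", "A", "A and B", "A and B", "B", "B"),
  column2 = NA
)
vals <- c("A", "B")
var <- "column1"

# Split every entry into its single values.
# The separator " and " is an assumption based on the example data.
parts <- strsplit(df[[var]], " and ", fixed = TRUE)

# For each value, test exact membership in each row's pieces
df[paste(var, vals, sep = "_")] <- sapply(
  vals,
  \(x) ifelse(vapply(parts, \(p) x %in% p, logical(1)), "Yes", "No")
)
df
```

For the example data this should give the same Yes/No columns as the grepl() version; the two only differ when one value is a substring of another.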
    

    If you have more values to search for, just add them to the vals vector:

    vals <- c("A", "B", "C")
    

    Run the above code again to get:

    df
      column1 column2 column1_A column1_B column1_C
    1       A      NA       Yes        No        No
    2       A      NA       Yes        No        No
    3 A and B      NA       Yes       Yes        No
    4 A and B      NA       Yes       Yes        No
    5       B      NA        No       Yes        No
    6       B      NA        No       Yes        No
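
Since the question asks for this to work no matter how many single values there are, vals could also be derived from the data instead of typed by hand. The following is a sketch under the assumption that combined entries are joined by the literal string " and " (inferred from the example, not stated in the question); it uses fixed = TRUE so each value is matched as a literal string rather than as a regular expression:

```r
df <- data.frame(
  column1 = c("A", "A", "A and B", "A and B", "B", "B"),
  column2 = NA
)
var <- "column1"

# Collect every distinct single value found in the column.
# The " and " separator is an assumption; sort() also drops NA entries.
vals <- sort(unique(unlist(strsplit(df[[var]], " and ", fixed = TRUE))))

# Same construction as in the answer, now driven by the derived vals
df[paste(var, vals, sep = "_")] <- sapply(
  vals,
  \(x) ifelse(grepl(x, df[[var]], fixed = TRUE), "Yes", "No")
)
df
```

Deriving vals this way only creates columns for values that actually occur, so a value such as "C" that is absent from the data would not get an all-"No" column unless it is added to vals manually.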