Tags: r, text, dummy-variable

How can you make a dummy variable based on part of a character string?


I want to make dummy variables based on whether a specific word is present in a column. I've included an example to clarify:

source/medium           qr_dummy

Amsterdam/qr_code          0 
Rotterdam/offline          0
Utrecht/online             0

I want to have a 1 if qr_code is present in the source/medium column. I tried the code below, but because == only matches when the whole string is exactly "qr_code", it won't give a 1.

df$qr_code_dummy[df$sourceMedium == "qr_code"] <- 1
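To illustrate the difference, here is a minimal sketch using a hypothetical vector x that holds the three example values (the names x, whole_match and partial_match are made up for this illustration). It compares the whole-string test with a substring test:

```r
# Hypothetical vector holding the three example values
x <- c("Amsterdam/qr_code", "Rotterdam/offline", "Utrecht/online")

# == compares each whole string against "qr_code", so nothing matches
whole_match <- x == "qr_code"
print(whole_match)
# [1] FALSE FALSE FALSE

# grepl() tests whether "qr_code" occurs anywhere in each string;
# fixed = TRUE treats the pattern as literal text, not a regular expression
partial_match <- grepl("qr_code", x, fixed = TRUE)
print(partial_match)
# [1]  TRUE FALSE FALSE
```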

So the wanted outcome looks as follows:

source/medium           qr_dummy

Amsterdam/qr_code          1 
Rotterdam/offline          0
Utrecht/online             0


Solution

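  • grepl is a good choice here, because it tests whether a pattern occurs anywhere in a string rather than requiring the whole string to match. Here's an example using dplyr, with ifelse converting the logical result to 0 and 1.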

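    library(dplyr)

    # Example data from the question
    df <- data.frame(sourceMedium = c('Amsterdam/qr_code','Rotterdam/offline','Utrecht/online'))
    # grepl() is TRUE where 'qr_code' occurs in the string; ifelse() maps TRUE/FALSE to 1/0
    result <- df %>% mutate(qr_code_dummy = ifelse(grepl('qr_code', sourceMedium), 1, 0))
    result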
    
    #       sourceMedium qr_code_dummy
    # 1 Amsterdam/qr_code            1
    # 2 Rotterdam/offline            0
    # 3    Utrecht/online            0
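If you would rather not load dplyr, the same idea can be sketched in base R. This is only one possible variant, not the method from the answer above: it relies on as.integer() turning TRUE/FALSE into 1/0. The second column, qr_medium_dummy, is a hypothetical stricter version that assumes the medium is always the part after the slash and only matches when the string ends in /qr_code.

```r
# Example data from the question
df <- data.frame(
  sourceMedium = c("Amsterdam/qr_code", "Rotterdam/offline", "Utrecht/online")
)

# Substring match: 1 if "qr_code" occurs anywhere in the value, else 0.
# fixed = TRUE treats the pattern as literal text, not a regular expression.
df$qr_code_dummy <- as.integer(grepl("qr_code", df$sourceMedium, fixed = TRUE))

# Hypothetical stricter variant (assumes the medium follows the "/"):
# 1 only if the value ends in "/qr_code".
df$qr_medium_dummy <- as.integer(grepl("/qr_code$", df$sourceMedium))

print(df)
```

Unlike the subset assignment in the question, this fills every row, so rows without a match get 0 instead of NA.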