Tags: r, string, grepl

Flag when character appears more than once in a string


I have seen something similar answered for Python but not for R. Say I have the sample data below, and I want to create the "want" column, which flags when the character "|" appears more than once in the string in the "var1" column. How would I do this in R? I know I can use grepl to flag whenever "|" appears, but this would also capture when it only appears once.

Sample data:

var1<-c("BLUE|RED","RED|BLUE","WHITE|BLACK|ORANGE","BLACK|WHITE|ORANGE")
want<-c(0,0,1,1)
have<-data.frame(var1,want)


var1                 want
BLUE|RED              0
RED|BLUE              0
WHITE|BLACK|ORANGE    1
BLACK|WHITE|ORANGE    1

Solution

  • str_count from stringr can be used: count the occurrences of | in each string. Because | is a regex metacharacter, either escape it (\\|) or specify it as fixed(). Then compare the count with 1 to get a logical vector (> 1), and convert the logical to binary (as.integer or unary +).

    library(stringr)
    have$want <- +(str_count(have$var1, fixed("|") ) > 1)
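If adding a stringr dependency is not desirable, the same count-then-compare idea can be sketched in base R. This is an alternative and not part of the answer above: it counts literal "|" matches with gregexpr and regmatches, and also shows a grepl pattern that only matches when two pipes are present. The variable names n_pipes and want_grepl are made up for illustration.

```r
var1 <- c("BLUE|RED", "RED|BLUE", "WHITE|BLACK|ORANGE", "BLACK|WHITE|ORANGE")

# Count literal "|" in each string; fixed = TRUE treats the pattern
# as plain text, so no escaping is needed
n_pipes <- lengths(regmatches(var1, gregexpr("|", var1, fixed = TRUE)))

# Flag strings that contain more than one "|"
want <- as.integer(n_pipes > 1)
want
# 0 0 1 1

# grepl variant: the regex needs two escaped pipes with anything between them
want_grepl <- as.integer(grepl("\\|.*\\|", var1))
```

Both approaches give the same flags on the sample data. The counting version generalizes more easily if the threshold changes (for example, "more than two").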