r dplyr case-when grepl

Use multiple expressions in a grepl pattern to create a new column


I would like to create a new column based on the string in another column of my data frame. Using case_when and grepl, I would like to assign a new class if the string contains value1 OR value2, but I am not sure how to express this 'OR' condition.

Dummy example (not working):

library(dplyr)
df <- data.frame(type = c("a_m5", "a_m20", "a_5"))

df %>% 
  mutate(modif = case_when(
    grepl('_m5'|'_m20', type) ~ 'less', # how to specify here | OR symbol in R? 
    grepl('_5', type) ~ 'more'))

Of course, it works if I specify the statements one by one (working):

df %>% 
  mutate(modif = case_when(
    grepl('_m5', type) ~ 'less',
    grepl('_m20', type) ~ 'less',
    grepl('_5', type) ~ 'more'))

But I wonder how to do this in one line, since I have multiple options. Maybe something like chars %in% vector would work here?

Desired output:

   type modif
1  a_m5  less
2 a_m20  less
3   a_5  more

Solution

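  • You may use %in% here, but note that it tests for exact equality rather than substring matches, so the vector must contain the full values:

    df %>% 
        mutate(modif = case_when(
            type %in% c('a_m5', 'a_m20') ~ 'less',
            TRUE ~ 'more')
        )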
    

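    You could also keep one value per predicate, to keep it consistent with your working version. Each left-hand side has to be a logical condition, so compare type against the value:

    df %>% 
        mutate(modif = case_when(
            type == 'a_m5' ~ 'less',
            type == 'a_m20' ~ 'less',
            TRUE ~ 'more')
        )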
    

    Note that I don't bother explicitly checking for a _5 value, assuming that it would be the only other possible value.
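
If you would rather keep substring matching with grepl, as in the question, one option is to put the OR inside the regular expression itself: within a pattern string, | means alternation. The sketch below assumes the substrings contain no regex metacharacters; less_patterns and less_regex are names I made up for illustration, and the pattern is built from a vector so that more options can be added easily.

```r
library(dplyr)

df <- data.frame(type = c("a_m5", "a_m20", "a_5"))

# Hypothetical helper vector: substrings that should map to 'less'.
less_patterns <- c("_m5", "_m20")

# Collapse the vector into a single regex, "_m5|_m20".
# Inside the pattern string, | means OR.
less_regex <- paste(less_patterns, collapse = "|")

result <- df %>%
  mutate(modif = case_when(
    grepl(less_regex, type) ~ "less",  # matches if ANY substring is found
    grepl("_5", type) ~ "more"
  ))

print(result)
```

This is equivalent to writing grepl('_m5|_m20', type) directly. Rows that match neither condition would get NA, so add a TRUE ~ ... fallback if other values can occur.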