Tags: r, string, if-statement, contains, mutated

Mutate variable if certain columns contain string in R


I have been struggling with this dataset for hours. I have searched and tried many things without success (I am a novice in R), so I really hope you can help me.

I have this dataset:

    df <- data.frame(ID  = c(1, 2, 3, 4, 5),
                     a.1 = c("A", "C", "C", "B", "D"),
                     a.2 = c("C", "C", "D", "A", "B"),
                     b.1 = c("D", "C", "A", "B", "D"),
                     b.2 = c("D", "B", "C", "A", "A"))

  ID a.1 a.2 b.1 b.2
1  1   A   C   D   D
2  2   C   C   C   B
3  3   C   D   A   C
4  4   B   A   B   A
5  5   D   B   D   A

I would like to create (mutate) a new variable called "result" that is:

  • "1" if at least one of the columns with prefix "a." contains "A" or "B"
  • "0" if none of the columns with prefix "a." contains "A" or "B"

So I would get the following result:

  ID a.1 a.2 b.1 b.2 result
1  1   A   C   D   D      1
2  2   C   C   C   B      0
3  3   C   D   A   C      0
4  4   B   A   B   A      1
5  5   D   B   D   A      1

In my real dataset I have 100 variables with prefix "a.", so I cannot select the columns individually.

Hopefully you guys can help me!

Thank you very much!


Solution

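    library(dplyr)

    # For each row, check whether any "a."-prefixed column equals "A" or "B";
    # multiplying the logical by 1L converts TRUE/FALSE to 1/0.
    df %>%
      rowwise() %>%
      mutate(res = any(c_across(starts_with("a.")) %in% c("A", "B")) * 1L)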
    
    #> # A tibble: 5 x 6
    #> # Rowwise: 
    #>      ID a.1   a.2   b.1   b.2     res
    #>   <dbl> <chr> <chr> <chr> <chr> <int>
    #> 1     1 A     C     D     D         1
    #> 2     2 C     C     C     B         0
    #> 3     3 C     D     A     C         0
    #> 4     4 B     A     B     A         1
    #> 5     5 D     B     D     A         1
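
As a possible alternative that avoids `rowwise()`, the same check can be done one whole column at a time. This is only a minimal sketch in base R, assuming the `df` from the question; the helper names `a_cols` and `hit` are made up for illustration, and the new column is named `result` as the question asks (the answer above calls it `res`).

```r
# Sample data from the question
df <- data.frame(
  ID  = c(1, 2, 3, 4, 5),
  a.1 = c("A", "C", "C", "B", "D"),
  a.2 = c("C", "C", "D", "A", "B"),
  b.1 = c("D", "C", "A", "B", "D"),
  b.2 = c("D", "B", "C", "A", "A")
)

# Logical index of the columns whose names start with "a."
# (a_cols is a hypothetical helper name)
a_cols <- startsWith(names(df), "a.")

# One logical vector per "a." column (TRUE where the value is "A" or "B"),
# combined element-wise with OR, so a row is TRUE if any column matched
hit <- Reduce(`|`, lapply(df[a_cols], function(x) x %in% c("A", "B")))

# Convert TRUE/FALSE to 1/0 and store it as the new column
df$result <- as.integer(hit)
df
```

Because each column is handled as a whole vector, this should sidestep the per-row overhead of `rowwise()`, which may matter with 100 "a." columns and many rows. With dplyr 1.0.4 or later, `if_any()` inside `mutate()` should express the same idea in pipeline form.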