Using str_detect on multiple columns with across


I would like to create a new column based on the results of str_detect across multiple columns using across.

For example, in the test data below, I'd like to search for "No job" across columns that start with "job", then return 1 if that string is detected in any of the columns, and 0 if it is not.

test_data <-  data.frame("job1" = c('Sales','Baker','Blacksmith','Brewer'), 
                         "job2" = c('Mailman','Jockey','Jobhunter',"No job"),
                         "id" = c("id_1", "id_2", "id_3", "id_4"))

# Output I'd like:

#         job1      job2   id no_job
#1      Sales   Mailman id_1      0
#2      Baker    Jockey id_2      0
#3 Blacksmith Jobhunter id_3      0
#4     Brewer    No job id_4      1

I know I could unite the columns that start with "job", and then just use str_detect on that new column like this:

test_data2 <- test_data %>%
    unite(col = "all_jobs", starts_with("job"), sep = ", ", remove = FALSE) %>%
    mutate(no_job = if_else(str_detect(all_jobs, "No job"), 1, 0))

... but I was wondering if there is a way to use across to do the same thing. I've tried variations on the following but haven't gotten any of them to work.

test_data2 <- test_data %>%
    mutate(no_job = if_else(across(starts_with("job"), str_detect(., "No job")), 1, 0))

Solution

  • One option could be:

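    library(dplyr)
    library(stringr)

    test_data %>%
     rowwise() %>%  # evaluate the mutate() expression one row at a time
     # c_across(-id) collects every column except id into one vector per row;
     # the unary + turns the TRUE/FALSE from any() into 1/0
     mutate(no_job = +any(str_detect(c_across(-id), "No job")))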
    
      job1       job2      id    no_job
      <fct>      <fct>     <fct>  <int>
    1 Sales      Mailman   id_1       0
    2 Baker      Jockey    id_2       0
    3 Blacksmith Jobhunter id_3       0
    4 Brewer     No job    id_4       1
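
The attempt in the question fails for two reasons: across() expects a function or lambda as its second argument rather than an already-evaluated call, and it returns a data frame of results (one column per input column), which if_else() cannot use as a condition. As a possible alternative to the rowwise approach, here is a minimal sketch using if_any(), assuming dplyr 1.0.4 or later is installed. It keeps the starts_with("job") selection from the question, and the name result is only illustrative.

```r
library(dplyr)
library(stringr)

test_data <- data.frame(
  job1 = c("Sales", "Baker", "Blacksmith", "Brewer"),
  job2 = c("Mailman", "Jockey", "Jobhunter", "No job"),
  id   = c("id_1", "id_2", "id_3", "id_4")
)

# if_any() applies the lambda to each selected column and combines the
# logical results row by row with OR; as.integer() maps TRUE/FALSE to 1/0.
result <- test_data %>%
  mutate(no_job = as.integer(if_any(starts_with("job"), ~ str_detect(.x, "No job"))))

result$no_job
# [1] 0 0 0 1
```

Because if_any() works on whole columns, this version does not need rowwise(), and the result is a regular (non-rowwise) data frame.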