Group by using str_detect for groups with similar strings

Consider this example data:

library(tidyverse)

dt <- tibble(Poison = c('Arsenic', 'Arsenic in Wine', 'Cyanide', 'Cyanide and Sugar'),
             Result = c('Death', 'Death With Class', 'Death', 'Death'))

I want to create a column that gives each group an identification number. However, I want the poisons to be grouped by string detection, i.e., 'Arsenic' and 'Arsenic in Wine' should form one group, and 'Cyanide' and 'Cyanide and Sugar' another. Currently, R treats each distinct poison as its own group:

dt <- dt %>%
  group_by(Poison) %>%
  mutate(Group = n())

# A tibble: 4 × 3
# Groups:   Poison [4]
  Poison            Result           Group
  <chr>             <chr>            <int>
1 Arsenic           Death                1
2 Arsenic in Wine   Death With Class     1
3 Cyanide           Death                1
4 Cyanide and Sugar Death                1

I want 'Arsenic' and 'Arsenic in Wine' to be Group 1, and 'Cyanide' and 'Cyanide and Sugar' to be Group 2. Any ideas?

Solution

A combination of case_when and grepl could be useful:

dt %>% 
  mutate(Group = case_when(
    grepl("Arsenic", Poison) ~ 1,
    grepl("Cyanide", Poison) ~ 2
  ))
# A tibble: 4 × 3
  Poison            Result           Group
  <chr>             <chr>            <dbl>
1 Arsenic           Death                1
2 Arsenic in Wine   Death With Class     1
3 Cyanide           Death                2
4 Cyanide and Sugar Death                2
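
Since the question asks about str_detect, the same idea can probably be written with stringr and driven by a vector of keywords, so that the group number is simply the position of the first keyword that matches. This is only a sketch: `keywords` and `res` are names made up for illustration, and rows that match no keyword are assumed to get NA.

```r
library(tidyverse)

dt <- tibble(Poison = c('Arsenic', 'Arsenic in Wine', 'Cyanide', 'Cyanide and Sugar'),
             Result = c('Death', 'Death With Class', 'Death', 'Death'))

# Hypothetical keyword list; a keyword's position becomes its group number
keywords <- c("Arsenic", "Cyanide")

res <- dt %>%
  mutate(Group = map_int(Poison, function(p) {
    # Test this one poison against every keyword (treated as regex patterns)
    hit <- which(str_detect(p, keywords))
    # No match gives NA; otherwise take the first matching keyword's index
    if (length(hit) == 0) NA_integer_ else hit[1]
  }))

res$Group
# 1 1 2 2
```

Adding a new group then only requires appending to `keywords` rather than writing another case_when branch.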

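If you don't want to hard-code the poison names, you can use the first word of each string as the grouping key. Note that factor levels are sorted alphabetically, so the numbering follows that order rather than order of appearance:

dt %>% 
  # Drop everything from the first space onward, leaving the first word
  mutate(Group = sub(" .*", "", Poison) %>% 
           as.factor() %>% 
           # Factor level codes serve as the group numbers
           as.integer())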
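
A variant of the same first-word idea that stays within dplyr's grouping machinery is sketched below. It assumes dplyr 1.0.0 or later (for cur_group_id()) and that the first word is the right grouping key; `Key` and `res2` are names chosen here for illustration only.

```r
library(tidyverse)

dt <- tibble(Poison = c('Arsenic', 'Arsenic in Wine', 'Cyanide', 'Cyanide and Sugar'),
             Result = c('Death', 'Death With Class', 'Death', 'Death'))

res2 <- dt %>%
  # Temporary grouping key: the first word of each poison name
  group_by(Key = word(Poison, 1)) %>%
  # cur_group_id() returns the index of the current group
  mutate(Group = cur_group_id()) %>%
  ungroup() %>%
  # Drop the helper column once the ids are assigned
  select(-Key)

res2$Group
# 1 1 2 2
```

Like the factor approach, this numbers the groups in sorted key order, so 'Arsenic' comes before 'Cyanide' regardless of row order.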