r, data-cleaning, data-wrangling, grepl, mutate

Mutate with multiple conditionals


I'm trying to mutate a new column called category that classifies each product by its product name. I would also like to mutate a column in which rows with caffeine_mg > 0 are labelled as having caffeine and rows with caffeine_mg = 0 as having no caffeine.

sbucks <- read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-12-21/starbucks.csv')
sbucks_new <- sbucks %>% 
  mutate(category = case_when(grepl("Tea", product_name) ~ "Tea",
                              grepl(c("coffee","Coffee","Caffè"), product_name) ~ "Coffee",
                              grepl("Smoothie", product_name) ~ "Smoothie",
                              grepl("Hot Chocolate", product_name) ~ "Hot Chocolate",
                              grepl("Espresso", product_name) ~ "Espresso",
                              grepl("Refreshers", product_name) ~ "Refreshers",
                              TRUE ~ "Others")) %>% 
  mutate(Caffeine = case_when(caffeine_mg=0 ~ "No",
                              caffeine_mg>0 ~ "Yes"))

Solution

  • There were two errors in the OP's code:

    • In the second call to grepl, the pattern is a character vector, but grepl expects a single pattern. The alternatives should be collapsed into one regex with |, as in xxx|yyy|zzz.

    • In the second mutate, the equality comparison operator is == and not =. Inside a function call, = names an argument instead of comparing values.
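
As a minimal sketch of the two fixes in isolation, the snippet below uses base R only. The product names and caffeine values are made up for illustration and are not rows from the dataset.

```r
# Hypothetical product names (illustration only, not from the Starbucks data)
drinks <- c("iced coffee", "Caffè Latte", "Chai Tea Latte", "Hot Chocolate")

# Collapse a vector of alternatives into one regex: "coffee|Coffee|Caffè"
patterns <- c("coffee", "Coffee", "Caffè")
coffee_regex <- paste(patterns, collapse = "|")

# grepl() now receives a single pattern and returns one logical per element
is_coffee <- grepl(coffee_regex, drinks)

# `==` compares element-wise and returns a logical vector
caffeine_mg <- c(130, 75, 0, 0)  # made-up values
no_caffeine <- caffeine_mg == 0
```

Building the regex with paste() keeps the list of alternatives as a vector, which may be easier to maintain than a hand-typed string.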

    sbucks %>% 
        mutate(category = case_when(grepl("Tea", product_name) ~ "Tea",
                                    grepl("coffee|Coffee|Caffè", product_name) ~ "Coffee",
                                    grepl("Smoothie", product_name) ~ "Smoothie",
                                    grepl("Hot Chocolate", product_name) ~ "Hot Chocolate",
                                    grepl("Espresso", product_name) ~ "Espresso",
                                    grepl("Refreshers", product_name) ~ "Refreshers",
                                    TRUE ~ "Others")) %>% 
        mutate(Caffeine = case_when(caffeine_mg == 0 ~ "No",
                                    caffeine_mg > 0 ~ "Yes"))
    
    # A tibble: 1,147 × 17
       product_name                           size    milk  whip serv_size_…¹ calor…² total…³ satur…⁴ trans…⁵ chole…⁶ sodiu…⁷ total…⁸ fiber_g sugar_g caffe…⁹ categ…˟ Caffe…˟
       <chr>                                  <chr>  <dbl> <dbl>        <dbl>   <dbl>   <dbl>   <dbl> <chr>     <dbl>   <dbl>   <dbl> <chr>     <dbl>   <dbl> <chr>   <chr>  
     1 brewed coffee - dark roast             short      0     0          236       3     0.1       0 0             0       5       0 0             0     130 Coffee  Yes    
     2 brewed coffee - dark roast             tall       0     0          354       4     0.1       0 0             0      10       0 0             0     193 Coffee  Yes    
     3 brewed coffee - dark roast             grande     0     0          473       5     0.1       0 0             0      10       0 0             0     260 Coffee  Yes    
     4 brewed coffee - dark roast             venti      0     0          591       5     0.1       0 0             0      10       0 0             0     340 Coffee  Yes    
     5 brewed coffee - decaf pike place roast short      0     0          236       3     0.1       0 0             0       5       0 0             0      15 Coffee  Yes    
     6 brewed coffee - decaf pike place roast tall       0     0          354       4     0.1       0 0             0      10       0 0             0      20 Coffee  Yes    
     7 brewed coffee - decaf pike place roast grande     0     0          473       5     0.1       0 0             0      10       0 0             0      25 Coffee  Yes    
     8 brewed coffee - decaf pike place roast venti      0     0          591       5     0.1       0 0             0      10       0 0             0      30 Coffee  Yes    
     9 brewed coffee - medium roast           short      0     0          236       3     0.1       0 0             0       5       0 0             0     155 Coffee  Yes    
    10 brewed coffee - medium roast           tall       0     0          354       4     0.1       0 0             0       5       0 0             0     235 Coffee  Yes    
    # … with 1,137 more rows, and abbreviated variable names ¹​serv_size_m_l, ²​calories, ³​total_fat_g, ⁴​saturated_fat_g, ⁵​trans_fat_g, ⁶​cholesterol_mg, ⁷​sodium_mg,
    #   ⁸​total_carbs_g, ⁹​caffeine_mg, ˟​category, ˟​Caffeine
    # ℹ Use `print(n = ...)` to see more rows
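
A possible variation, sketched here on a small made-up tibble rather than the real data: grepl's ignore.case argument can replace the separate coffee and Coffee alternatives, and dplyr's if_else can stand in for a two-branch case_when. The toy and toy_new names and all values below are assumptions for illustration.

```r
library(dplyr)

# Toy stand-in for the Starbucks data; names and values are made up
toy <- tibble(
  product_name = c("brewed coffee - dark roast", "Caffè Latte",
                   "Chai Tea Latte", "Hot Chocolate", "Lemonade"),
  caffeine_mg  = c(260, 150, 95, 25, 0)
)

toy_new <- toy %>%
  mutate(
    # case_when() uses the first condition that is TRUE for each row
    category = case_when(
      grepl("Tea", product_name) ~ "Tea",
      # ignore.case = TRUE covers both "coffee" and "Coffee"
      grepl("coffee|caffè", product_name, ignore.case = TRUE) ~ "Coffee",
      grepl("Hot Chocolate", product_name) ~ "Hot Chocolate",
      TRUE ~ "Others"
    ),
    # A two-way split fits if_else(condition, true, false)
    Caffeine = if_else(caffeine_mg > 0, "Yes", "No")
  )
```

Because case_when stops at the first matching condition, the order of the branches decides the category of any product name that matches more than one pattern.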