I have a dataframe `df` which contains foods and their corresponding ingredients (`df` is pasted at the end).
I am interested in which foods contain "Flour", "Water" or "Salt".
Using `str_detect`, you can determine whether a food contains one or more of these ingredients:
library(tidyverse)
strings_to_check <- c("Water", "Salt", "Flour")
df2 <- df %>%
  mutate(Key_Ingredient = str_detect(Ingredients, paste(strings_to_check, collapse = "|")))
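To make the pattern explicit, here is a minimal sketch of what `paste(..., collapse = "|")` builds and what `str_detect` returns for it. The vector `x` is only an illustration: it holds two ingredient strings copied from `df`.

```r
library(stringr)

strings_to_check <- c("Water", "Salt", "Flour")

# Collapsing with "|" turns the vector into a single regex alternation
pattern <- paste(strings_to_check, collapse = "|")
pattern
# "Water|Salt|Flour"

# Two ingredient strings copied from df (illustration only)
x <- c("Pasta, Flour, Water, Mushrooms, Chicken, Vanilla Extract, Yeast",
       "Baking Powder, Garlic, Eggs, Ice, Sugar, Tofu, Rice")

# TRUE if at least one alternative matches anywhere in the string
str_detect(x, pattern)
# TRUE FALSE
```

Because the match is a plain substring match, `str_detect` only says whether something matched, not what or how many times.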
How can I go one step further and obtain the count of key ingredients used and return which of the key ingredients were used? Put another way, how do I count how many strings in the list of strings were detected and return the ones that were detected in a separate column of values?
The expected output is:
Food | Ingredients | Key_Ingredient | Key_Count | Key_Used |
---|---|---|---|---|
Appleberry Muffins | Flour, Vanilla Extract, Olive Oil, Milk, Garlic, Carrots, Chicken | TRUE | 1 | Flour |
Blue Moon Pancakes | Baking Powder, Garlic, Eggs, Ice, Sugar, Tofu, Rice | FALSE | 0 | NA |
Crystalized Starfruit | Milk, Beef, Tofu, Rice, Salt, Garlic, Mushrooms | TRUE | 1 | Salt |
Dragonfruit Delight | Rice, Milk, Pork, Yeast, Carrots, Tofu, Mushrooms | FALSE | 0 | NA |
Ethereal Eclairs | Pasta, Flour, Water, Mushrooms, Chicken, Vanilla Extract, Yeast | TRUE | 2 | Flour, Water |
Flaming Firefruit | Pepper, Yeast, Vanilla Extract, Sugar, Wheat, Olive Oil, Pork | FALSE | 0 | NA |
Glowing Grapes | Garlic, Nutmeg, Beef, Salt, Tofu, Onions, Baking Powder | TRUE | 1 | Salt |
Honeydew Haze | Salt, Water, Rice, Yeast, Flour, Honey, Mushrooms | TRUE | 3 | Salt, Water, Flour |
Iridescent Ice Cream | Water, Salt, Onions, Pasta, Spinach, Pork, Carrots | TRUE | 2 | Water, Salt |
Jellybean Jamboree | Salt, Eggs, Flour, Baking Powder, Water, Potatoes, Yeast | TRUE | 3 | Salt, Flour, Water |
Kiwi Kaleidoscope | Water, Honey, Salt, Potatoes, Vanilla Extract, Pork, Pasta | TRUE | 2 | Water, Salt |
Lunar Lemons | Salt, Tofu, Olive Oil, Baking Powder, Pork, Vanilla Extract, Cinnamon | TRUE | 1 | Salt |
Mystic Marshmallows | Salt, Flour, Onions, Water, Chicken, Eggs, Milk | TRUE | 3 | Salt, Flour, Water |
Nebula Noodles | Honey, Flour, Pork, Beef, Potatoes, Spinach, Chicken | TRUE | 1 | Flour |
Omega Oranges | Mushrooms, Water, Salt, Olive Oil, Spinach, Tofu, Potatoes | TRUE | 2 | Water, Salt |
Phantom Peaches | Wheat, Carrots, Baking Powder, Tofu, Eggs, Nutmeg, Potatoes | FALSE | 0 | NA |
Quasar Quince | Honey, Tomatoes, Vanilla Extract, Flour, Garlic, Butter, Salt | TRUE | 2 | Flour, Salt |
Radiant Raspberries | Salt, Yeast, Garlic, Rice, Sugar, Spinach, Baking Powder | TRUE | 1 | Salt |
Stellar Strawberries | Flour, Onions, Spinach, Pork, Yeast, Water, Potatoes | TRUE | 2 | Flour, Water |
Twilight Tangerines | Potatoes, Eggs, Kale, Beef, Spinach, Vanilla Extract, Milk | FALSE | 0 | NA |
Universal Ugli Fruit | Cinnamon, Yeast, Potatoes, Flour, Salt, Water, Garlic | TRUE | 3 | Flour, Salt, Water |
Vortex Veggies | Milk, Salt, Flour, Olive Oil, Garlic, Water, Spinach | TRUE | 3 | Salt, Flour, Water |
Whirlwind Walnuts | Salt, Flour, Beef, Garlic, Milk, Potatoes, Olive Oil | TRUE | 2 | Salt, Flour |
Xenon Xacuti | Water, Salt, Yeast, Rice, Garlic, Vanilla Extract, Eggs | TRUE | 2 | Water, Salt |
Yellow Yams of Yore | Vanilla Extract, Garlic, Chestnuts, Baking Powder, Tofu, Carrots, Sugar | FALSE | 0 | NA |
Zephyr Zucchini | Pork, Honey, Baking Powder, Onions, Sugar, Yeast, Water | TRUE | 1 | Water |
The full data for df is:
df <- data.frame(
Food = c("Appleberry Muffins", "Blue Moon Pancakes", "Crystalized Starfruit",
"Dragonfruit Delight", "Ethereal Eclairs", "Flaming Firefruit",
"Glowing Grapes", "Honeydew Haze", "Iridescent Ice Cream",
"Jellybean Jamboree", "Kiwi Kaleidoscope", "Lunar Lemons",
"Mystic Marshmallows", "Nebula Noodles", "Omega Oranges",
"Phantom Peaches", "Quasar Quince", "Radiant Raspberries",
"Stellar Strawberries", "Twilight Tangerines", "Universal Ugli Fruit",
"Vortex Veggies", "Whirlwind Walnuts", "Xenon Xacuti",
"Yellow Yams of Yore", "Zephyr Zucchini"),
Ingredients = c("Flour, Vanilla Extract, Olive Oil, Milk, Garlic, Carrots, Chicken",
"Baking Powder, Garlic, Eggs, Ice, Sugar, Tofu, Rice",
"Milk, Beef, Tofu, Rice, Salt, Garlic, Mushrooms",
"Rice, Milk, Pork, Yeast, Carrots, Tofu, Mushrooms",
"Pasta, Flour, Water, Mushrooms, Chicken, Vanilla Extract, Yeast",
"Pepper, Yeast, Vanilla Extract, Sugar, Wheat, Olive Oil, Pork",
"Garlic, Nutmeg, Beef, Salt, Tofu, Onions, Baking Powder",
"Salt, Water, Rice, Yeast, Flour, Honey, Mushrooms",
"Water, Salt, Onions, Pasta, Spinach, Pork, Carrots",
"Salt, Eggs, Flour, Baking Powder, Water, Potatoes, Yeast",
"Water, Honey, Salt, Potatoes, Vanilla Extract, Pork, Pasta",
"Salt, Tofu, Olive Oil, Baking Powder, Pork, Vanilla Extract, Cinnamon",
"Salt, Flour, Onions, Water, Chicken, Eggs, Milk",
"Honey, Flour, Pork, Beef, Potatoes, Spinach, Chicken",
"Mushrooms, Water, Salt, Olive Oil, Spinach, Tofu, Potatoes",
"Wheat, Carrots, Baking Powder, Tofu, Eggs, Nutmeg, Potatoes",
"Honey, Tomatoes, Vanilla Extract, Flour, Garlic, Butter, Salt",
"Salt, Yeast, Garlic, Rice, Sugar, Spinach, Baking Powder",
"Flour, Onions, Spinach, Pork, Yeast, Water, Potatoes",
"Potatoes, Eggs, Kale, Beef, Spinach, Vanilla Extract, Milk",
"Cinnamon, Yeast, Potatoes, Flour, Salt, Water, Garlic",
"Milk, Salt, Flour, Olive Oil, Garlic, Water, Spinach",
"Salt, Flour, Beef, Garlic, Milk, Potatoes, Olive Oil",
"Water, Salt, Yeast, Rice, Garlic, Vanilla Extract, Eggs",
"Vanilla Extract, Garlic, Chestnuts, Baking Powder, Tofu, Carrots, Sugar",
"Pork, Honey, Baking Powder, Onions, Sugar, Yeast, Water")
)
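You can use the `str_count` and `str_match_all` functions. `str_count` returns the number of matches in each string, and `str_match_all` returns the matches themselves as a list of character matrices, so `Key_Used` ends up as a list column rather than a comma-separated string.
# Build the regex once: "Water|Salt|Flour"
pattern <- paste(strings_to_check, collapse = "|")
df2 <- df %>%
  mutate(Key_Ingredient = str_detect(Ingredients, pattern),
         Key_Count = str_count(Ingredients, pattern),
         Key_Used = str_match_all(Ingredients, pattern))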
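If you want `Key_Used` as a single comma-separated string with `NA` when nothing matched, as in the expected output, one possible sketch is below. It swaps in `str_extract_all` (which returns plain character vectors) and `purrr::map_chr` to collapse them. The word-boundary anchors `\\b`, the `unique()` call, the helper column `Key_Found` and the three-row `demo` data frame are my own additions for illustration, not part of the question.

```r
library(dplyr)
library(purrr)
library(stringr)

strings_to_check <- c("Water", "Salt", "Flour")

# Assumption: \\b keeps "Water" from matching inside a longer word
pattern <- paste0("\\b(", paste(strings_to_check, collapse = "|"), ")\\b")

# Three rows copied from df so the sketch runs on its own
demo <- data.frame(
  Food = c("Blue Moon Pancakes", "Ethereal Eclairs", "Honeydew Haze"),
  Ingredients = c("Baking Powder, Garlic, Eggs, Ice, Sugar, Tofu, Rice",
                  "Pasta, Flour, Water, Mushrooms, Chicken, Vanilla Extract, Yeast",
                  "Salt, Water, Rice, Yeast, Flour, Honey, Mushrooms")
)

demo2 <- demo %>%
  mutate(
    # One character vector of matches per row; unique() counts a repeated ingredient once
    Key_Found = map(str_extract_all(Ingredients, pattern), unique),
    Key_Ingredient = lengths(Key_Found) > 0,
    Key_Count = lengths(Key_Found),
    # Collapse the matches to one string, or NA when there were none
    Key_Used = map_chr(Key_Found,
                       ~ if (length(.x) > 0) paste(.x, collapse = ", ") else NA_character_)
  ) %>%
  select(-Key_Found)
```

On these three rows `Key_Count` is 0, 2 and 3, and `Key_Used` is `NA`, "Flour, Water" and "Salt, Water, Flour". The matches are listed in the order they appear in `Ingredients`, not in the order of `strings_to_check`.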