I have a multipart lookup table problem in R. I have a data frame, where the number in each column represents an item name. The item name can be found in the corresponding look up table.
Data:
> food.dat
Fruit Vegetable Meat Dairy
1 1 2 2 3
2 3 2 1 1
3 3 2 2 2
4 2 2 1 1
5 1 1 1 2
Lookup Table:
> food.lookup
FoodItem Number FoodName
1 Fruit 1 Banana
2 Fruit 2 Apple
3 Fruit 3 Mango
4 Vegetable 1 Carrot
5 Vegetable 2 Broccoli
6 Meat 1 Chicken
7 Meat 2 Fish
8 Dairy 1 Cheese
9 Dairy 2 Yogurt
10 Dairy 3 IceCream
Note that the number is not unique amongst foods. For example, a 1 represents a different FoodName in column Fruit (Banana) and a different FoodName in column Vegetable (Carrot).
I would like to recode the food.dat dataframe to have the FoodName value from the lookup table. If possible I would also like to be able to use a simple function and supply a FoodName and return a dataframe from food.dat which includes only rows that include that specified FoodName.
Thank you for your time and thoughts :)
split
the named vector
by the 'FoodItem' into a list
from the 'food.lookup'. Loop across
the 'food.dat' columns, extract the list
element and replace the values by matching
library(dplyr)
lst1 <- with(food.lookup, split(setNames(FoodName, Number), FoodItem))
food.dat %>%
mutate(across(all_of(names(lst1)), ~ lst1[[cur_column()]][as.character(.)]))
-output
Fruit Vegetable Meat Dairy
1 Banana Broccoli Fish IceCream
2 Mango Broccoli Chicken Cheese
3 Mango Broccoli Fish Yogurt
4 Apple Broccoli Chicken Cheese
5 Banana Carrot Chicken Yogurt
food.dat <- structure(list(Fruit = c(1L, 3L, 3L, 2L, 1L), Vegetable = c(2L,
2L, 2L, 2L, 1L), Meat = c(2L, 1L, 2L, 1L, 1L), Dairy = c(3L,
1L, 2L, 1L, 2L)), class = "data.frame", row.names = c("1", "2",
"3", "4", "5"))
food.lookup <- structure(list(FoodItem = c("Fruit", "Fruit",
"Fruit", "Vegetable",
"Vegetable", "Meat", "Meat", "Dairy", "Dairy", "Dairy"), Number = c(1L,
2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L), FoodName = c("Banana", "Apple",
"Mango", "Carrot", "Broccoli", "Chicken", "Fish", "Cheese", "Yogurt",
"IceCream")), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10"))