Search code examples
rjoindplyrmutate

Replacing values in one table from a corresponding key in another table by specific column


I am processing a large dataset from a questionnaire that contains coded responses in some but not all columns. I would like to replace the coded responses with actual values. The key/dictionary is stored in another database. The complicating factor is that different questions (stored as columns in original dataset) used the same code (typically numeric), but the code has different meanings depending on the column (question).

How can I replace the coded values in the original dataset with different valuse from a corresponding key stored in the dictionary table, but do it by specific column name (also stored in the dictionary table)?

Below is an example of the original dataset and the dictionary table, as well as desired result.

original <- data.frame(
  name = c('Jane','Mary','John', 'Billy'),
  home = c(1,3,4,2),
  car = c('b','b','a','b'),
  shirt = c(3,2,1,1),
  shoes = c('Black','Black','Black','Brown')
  )

keymap <- data.frame(
  column_name=c('home','home','home','home','car','car','shirt','shirt','shirt'),
  value_old=c('1','2','3','4','a','b','1','2','3'),
  value_new=c('Single family','Duplex','Condo','Apartment','Sedan','SUV','White','Red','Blue')
)

result <- data.frame(
  name = c('Jane','Mary','John', 'Billy'),
  home = c('Single family','Condo','Apartment','Duplex'),
  car = c('SUV','SUV','Sedan','SUV'),
  shirt = c('Blue','Red','White','White'),
  shoes = c('Black','Black','Black','Brown')
)

> original
   name home car shirt shoes
1  Jane    1   b     3 Black
2  Mary    3   b     2 Black
3  John    4   a     1 Black
4 Billy    2   b     1 Brown


> keymap
  column_name value_old     value_new
1        home         1 Single family
2        home         2        Duplex
3        home         3         Condo
4        home         4     Apartment
5         car         a         Sedan
6         car         b           SUV
7       shirt         1         White
8       shirt         2           Red
9       shirt         3          Blue


> result
   name          home   car shirt shoes
1  Jane Single family   SUV  Blue Black
2  Mary         Condo   SUV   Red Black
3  John     Apartment Sedan White Black
4 Billy        Duplex   SUV White Brown

I have tried different approaches using dplyr but have not gotten far as I do not have a robust understanding of the mutate/join syntax.


Solution

  • We may loop across the unique values from the 'column_name' column of 'keymap' in the original, subset the keymap that matches the column name (cur_column()), select the columns 2 and 3, deframe to a named vector and match with the values of the column for replacement

    library(dplyr)
    library(tibble)
    original %>% 
       mutate(across(all_of(unique(keymap$column_name)), ~ 
       (keymap %>%
                filter(column_name == cur_column()) %>%
                select(-column_name) %>%
                deframe)[as.character(.x)]))
    

    -output

       name          home   car shirt shoes
    1  Jane Single family   SUV  Blue Black
    2  Mary         Condo   SUV   Red Black
    3  John     Apartment Sedan White Black
    4 Billy        Duplex   SUV White Brown
    

    Or an approach in base R

    lst1 <- split(with(keymap, setNames(value_new, value_old)), keymap$column_name)
    original[names(lst1)] <- Map(\(x, y) y[as.character(x)], 
           original[names(lst1)], lst1)