Search code examples
rdplyrpurrr

Mutate in dplyr / purrr using an external table / dataframe


I want to use a separate dataframe to perform a mutate on test as follows:

library(tidyverse)

test <- tribble(
  ~col1, ~col2, ~col3,
  "one", "a", "b",
  "two", "b", "c",
  "three", "c", "d"
)

mutator <- tribble(
  ~mutate_from, ~mutate_to,
  "a", "X",
  "b", "X"
)

Desired output:

> test
# A tibble: 3 x 3
  col1  col2  col3 
  <chr> <chr> <chr>
1 one   a     X    
2 two   b     X    
3 three c     d   

I am stuck as to how I can implement this:

test %>% 
  mutate(col3 = map(mutator, ~.x %>% ?))

The end goal is to probably have 30 or so observations in mutator, so I am looking to functionise this somehow.


Solution

  • I am not sure if this is programmatic enough for your purposes, but you could just update the column:

    library(dplyr)
    
    test |>
      rows_update(rename(mutator, col2 = mutate_from, col3 = mutate_to), by = "col2")
    #   col1  col2  col3 
    #   <chr> <chr> <chr>
    # 1 one   a     X    
    # 2 two   b     X    
    # 3 three c     d  
    

    Or do a left-join:

    left_join(test, mutator, by = c(col2 = "mutate_from")) |>
      mutate(col3 = coalesce(mutate_to, col3), .keep = "unused")
    

    Or use match() to get the row index where there is a match:

    test |>
      mutate(col3 = coalesce(mutator$mutate_to[match(col2, mutator$mutate_from)], col3))
    

    If you wanted to do this across multiple columns then you could convert mutator into a named vector:

    lookup <- pull(mutator, mutate_to, mutate_from)
    
    test |>
      mutate(across(col3, \(x) ifelse(col2 %in% names(lookup), lookup[col2], x)))