r dplyr tidyr plyr

Change column names using *plyr where the mapping is given by two columns of another data frame


I have a simple data frame a

  x  y
1 1 11
2 2 22
3 3 33

and another one b

  old  new
1   x haha
2   y hoho

which gives the mapping of the old column names to new column names. I want the following data frame c.

  haha hoho
1    1   11
2    2   22
3    3   33

Note that the actual a has many columns, and the mapping between the two columns of b is not straightforward. Also, the rows of b may not be in the same order as the columns of a.

Is it possible to do this using plyr/dplyr? I am looking for something like the Python approach in "Changing dataframe columns names by columns from another dataframe python".


Solution

  • This is a great opportunity to use !!:

    library(tidyverse)
    
    data <- tribble(
      ~x, ~y,
      1,  11,
      2,  22,
      3,  33
    )
    
    name_tbl <- tribble(
      ~old, ~new,
      "x",  "haha",
      "y",  "hoho"
    )
    
    (name_pairs <- with(name_tbl, set_names(old, new)))
    #> haha hoho 
    #>  "x"  "y"
    
    rename(data, !!name_pairs)
    #> # A tibble: 3 x 2
    #>    haha  hoho
    #>   <dbl> <dbl>
    #> 1     1    11
    #> 2     2    22
    #> 3     3    33
    

    Created on 2019-10-21 by the reprex package (v0.3.0)

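    rename() takes name-value pairs in which the name is the new column name and the value is the old one. So we 1) take the vector of old names, 2) name its elements with the new names, and 3) pass the named vector to rename(), unquoted with !! because we are supplying the pairs as the value of an object rather than typing them out as syntax. Columns are matched by name, so the order of the rows in the mapping table does not matter.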
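
The question mentions that the rows of the mapping may be in a different order from the columns, and a real mapping table may also list names that are absent from the data. The following is a minimal sketch of one way to handle both cases, assuming dplyr 1.0.0 or later, where rename() accepts the tidyselect helpers all_of() and any_of(). The extra "z"/"hehe" row is invented here purely for illustration and is not part of the original question.

```r
library(dplyr)

# Same data as in the answer above.
data <- tibble(x = c(1, 2, 3), y = c(11, 22, 33))

# Mapping rows deliberately shuffled, plus one hypothetical entry ("z")
# that has no matching column in `data`.
name_tbl <- tibble(
  old = c("y", "z", "x"),
  new = c("hoho", "hehe", "haha")
)

# Named vector: names are the new column names, values are the old ones.
name_pairs <- setNames(name_tbl$old, name_tbl$new)

# any_of() renames the columns that exist and skips "z" without an error;
# all_of() would instead raise an error for the missing column.
renamed <- rename(data, any_of(name_pairs))

names(renamed)
#> [1] "haha" "hoho"
```

Choosing between all_of() and any_of() is a judgment call: all_of() is the stricter option and surfaces stale entries in the mapping table, while any_of() is convenient when one mapping is shared across several data frames.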