Search code examples
rdplyr

use dplyr to mutate columns which have numbers as names


In a dataframe, I have two columns (the names are numbers, due the table being the result of a pivot wider). I'd like to be able to use mutate using these columns but it can't make it work. Here is the data.

df <- structure(list(Age_Band = c("20-30", "20-30", "30-40", "30-40", 
                                  "40-50", "40-50", "50-60", "50-60"), Gender = c("female", "male", 
                                                                                  "female", "male", "female", "male", "female", "male"), `2012` = c(20881.9, 
                                                                                                                                                    19962.1, 29710.85, 26187.4, 50536.05, 21359.3, 26447.65, 24520.95
                                                                                  ), `2013` = c(20132.4, 17268.15, 22622.1, 21543.05, 39569.15, 
                                                                                                15701.1, 21450.95, 21020.6)), class = c("tbl_df", "tbl", "data.frame"
                                                                                                ), row.names = c(NA, -8L))

This is the code I'm trying to run. To create a new column with the difference.

df %>%
  mutate(difference = 2013 - 2012)

It doesn't work, I'm assuming because the column names are numbers? Is there another way to do it? Otherwise is there a way to rename them in dplyr, as I can't find a way to do that either. Any help greatly appreciated. Thanks.


Solution

  • I'd recommend renaming them, e.g using rename_with (or, if you can, avoid creating them in the first place)

    library(dplyr)
    
    df %>% 
      rename_with(~ if_else(grepl("^\\d+$", .x), paste0("var", .x), .x))
    # A tibble: 8 × 4
      Age_Band Gender var2012 var2013
      <chr>    <chr>    <dbl>   <dbl>
    1 20-30    female  20882.  20132.
    2 20-30    male    19962.  17268.
    3 30-40    female  29711.  22622.
    4 30-40    male    26187.  21543.
    5 40-50    female  50536.  39569.
    6 40-50    male    21359.  15701.
    7 50-60    female  26448.  21451.
    8 50-60    male    24521.  21021.
    

    Just sub works too

    df %>%
      rename_with(~ sub("^(\\d+)$", "var\\1", .x))
    

    If you have to use the numbers access them by treating them as a symbol by using

    • backticks `2012`
    • eval(as.name("2012"))
    • !!sym("2012")