Adding prefixes to some variables without touching others?


I'd like to produce a data frame like df3 from df1, i.e. add a prefix (important_) to the variables that don't have one, while leaving variables with certain prefixes (gea_, win_, hea_) untouched. So far I've only managed something like df2, where the important_ variables end up in a separate data frame, but I'd like all variables in the same data frame. Any thoughts would be much appreciated.

What I've got:

library(dplyr)

df1 <- data.frame("hea_income"=c(45000,23465,89522),"gea_property"=c(1,1,2) ,"win_state"=c("AB","CA","GA"), "education"=c(1,2,3), "commute"=c(13,32,1))

df2 <- df1 %>% select(-contains("gea_")) %>% select(-contains("win_")) %>% select(-contains("hea_"))  %>% setNames(paste0('important_', names(.)))

What I would like:

df3 <- data.frame("hea_income"=c(45000,23465,89522),"gea_property"=c(1,1,2) ,"win_state"=c("AB","CA","GA"), "important_education"=c(1,2,3), "important_commute"=c(13,32,1))

Solution

  • An option would be rename_at

    dfN <- df1 %>%
             rename_at(4:5, ~ paste0("important_", .))
    identical(dfN, df3)
    #[1] TRUE
    

    We can also use a regex if we want to specify the variables by name pattern rather than by numeric index. This assumes that the columns to rename are exactly those whose names don't already contain an underscore (_)

    df1 %>%
        rename_at(vars(matches("^[^_]*$")), ~ paste0("important_", .))
    #   hea_income gea_property win_state important_education important_commute
    #1      45000            1        AB                   1                13
    #2      23465            1        CA                   2                32
    #3      89522            2        GA                   3                 1
    

    Or negate the match with -, which selects every column whose name does not contain an underscore

    df1 %>%
        rename_at(vars(-matches("_")), ~ paste0("important_", .))
    #   hea_income gea_property win_state important_education important_commute
    #1      45000            1        AB                   1                13
    #2      23465            1        CA                   2                32
    #3      89522            2        GA                   3                 1
    

    All three solutions above give the expected output shown in the OP's post
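
The regex-based options above treat any underscore as a sign that a column already has a prefix. If only the three prefixes named in the question should be protected, one possible variant is sketched below. It assumes dplyr 1.0.0 or later (for rename_with) and a tidyselect version in which starts_with() accepts a character vector; the names keep_prefixes and df_new are made up for illustration and are not part of the original answer.

```r
library(dplyr)

# Same input as in the question
df1 <- data.frame(
  hea_income   = c(45000, 23465, 89522),
  gea_property = c(1, 1, 2),
  win_state    = c("AB", "CA", "GA"),
  education    = c(1, 2, 3),
  commute      = c(13, 32, 1)
)

# Prefixes to leave untouched (hypothetical helper vector, taken from the question text)
keep_prefixes <- c("gea_", "win_", "hea_")

# Rename every column that does NOT start with one of the protected prefixes
df_new <- df1 %>%
  rename_with(~ paste0("important_", .x), .cols = !starts_with(keep_prefixes))

names(df_new)
```

Unlike the matches("_") approach, a column such as home_town (an invented name) would also receive the important_ prefix here, because only the listed prefixes are excluded.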