I'd like to produce a data frame like df3 from df1, i.e. add a prefix (important_) to the variables that don't have one, without touching variables that already carry certain prefixes (gea_, win_, hea_). So far I've only managed something like df2, where the important_ variables end up in a separate data frame, but I'd like all variables in the same data frame. Any thoughts would be much appreciated.
What I've got:
library(dplyr)
df1 <- data.frame("hea_income"=c(45000,23465,89522),"gea_property"=c(1,1,2) ,"win_state"=c("AB","CA","GA"), "education"=c(1,2,3), "commute"=c(13,32,1))
df2 <- df1 %>% select(-contains("gea_")) %>% select(-contains("win_")) %>% select(-contains("hea_")) %>% setNames(paste0('important_', names(.)))
What I would like:
df3 <- data.frame("hea_income"=c(45000,23465,89522),"gea_property"=c(1,1,2) ,"win_state"=c("AB","CA","GA"), "important_education"=c(1,2,3), "important_commute"=c(13,32,1))
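One option is rename_at, selecting the columns to rename by position:
# funs() is deprecated since dplyr 0.8.0; a formula lambda does the same job
dfN <- df1 %>%
  rename_at(4:5, ~ paste0("important_", .))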
identical(dfN, df3)
#[1] TRUE
We can also use a regex if we want to select the variables by name rather than by numeric index. The assumption here is that the columns needing a prefix are exactly those whose names do not already contain a _, which is what ^[^_]*$ matches:
df1 %>%
  rename_at(vars(matches("^[^_]*$")), ~ paste0("important_", .))
#  hea_income gea_property win_state important_education important_commute
#1      45000            1        AB                   1                13
#2      23465            1        CA                   2                32
#3      89522            2        GA                   3                 1
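To see what the regex is selecting, here is a minimal base R sketch that applies the same pattern to the column names directly. The vector names cols and no_underscore are only illustrative, and the same no-underscore assumption applies.

```r
# Column names of df1, copied here so the snippet runs on its own
cols <- c("hea_income", "gea_property", "win_state", "education", "commute")

# TRUE for names that contain no underscore at all
no_underscore <- grepl("^[^_]*$", cols)
no_underscore
#[1] FALSE FALSE FALSE  TRUE  TRUE

# Prefix only those names
cols[no_underscore] <- paste0("important_", cols[no_underscore])
cols
#[1] "hea_income"          "gea_property"        "win_state"
#[4] "important_education" "important_commute"
```

The same vector could be assigned back with names(df1) <- cols if dplyr is not available.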
Or with matches and - to negate the selection, i.e. rename every column whose name does not contain a _:
df1 %>%
  rename_at(vars(-matches("_")), ~ paste0("important_", .))
#  hea_income gea_property win_state important_education important_commute
#1      45000            1        AB                   1                13
#2      23465            1        CA                   2                32
#3      89522            2        GA                   3                 1
All three solutions above give the expected output shown in the OP's post.
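With dplyr 1.0.0 or later, where rename_at is superseded, a rough equivalent can be written with rename_with. The sketch below also matches the question more literally by excluding the three named prefixes rather than relying on the underscore; keep_prefixes is just an illustrative name, and it assumes a tidyselect version in which starts_with accepts a character vector.

```r
library(dplyr)

df1 <- data.frame(
  hea_income   = c(45000, 23465, 89522),
  gea_property = c(1, 1, 2),
  win_state    = c("AB", "CA", "GA"),
  education    = c(1, 2, 3),
  commute      = c(13, 32, 1)
)

# Prefixes that mark columns to leave untouched
keep_prefixes <- c("gea_", "win_", "hea_")

# Rename every column that does NOT start with one of those prefixes
dfN <- df1 %>%
  rename_with(~ paste0("important_", .x), .cols = !starts_with(keep_prefixes))

names(dfN)
#[1] "hea_income"          "gea_property"        "win_state"
#[4] "important_education" "important_commute"
```

Selecting by the explicit prefixes means a column such as total_cost, which contains an underscore but none of the listed prefixes, would still be renamed.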