
R: How do I loop through a list of variables and create new ones with dynamic names?


I've been searching for hours and can't find a good example of what I want to do. I know how to do this easily in SAS, but I'm new to R. The code below creates two variables, and I need to repeat it to create dozens more that follow the same pattern: each new variable starts with "EPL_" and is computed from the existing variable that starts with "EP_" and has the same suffix. Examples:

backtowide$EPL_VARTREE <-  percent_rank(backtowide$EP_VARTREE)
backtowide$EPL_VARSKY <-  percent_rank(backtowide$EP_VARSKY)

How can I do this in a loop without having to repeat this line of code 20+ times?

Something like:

for i in VARLIST {
backtowide$EPL_i <-  percent_rank(backtowide$EP_i)
}

Solution

  • You can use across() to apply the percent_rank function to every column whose name starts with 'EP_', and use the .names argument to build the names of the new columns.

    library(dplyr)
    
    # Rank every EP_* column; .names swaps the EP_ prefix for EPL_ in each new name
    backtowide <- backtowide %>%
                     mutate(across(starts_with('EP_'),
                            percent_rank, .names = '{sub("^EP_", "EPL_", .col)}'))
    backtowide
    
    #   EP_VARTREE  EP_VARSKY EPL_VARTREE EPL_VARSKY
    #1 -0.56047565  1.7150650        0.00       1.00
    #2 -0.23017749  0.4609162        0.25       0.75
    #3  1.55870831 -1.2650612        1.00       0.00
    #4  0.07050839 -0.6868529        0.50       0.25
    #5  0.12928774 -0.4456620        0.75       0.50
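
    To help read the output above: percent_rank rescales each value's rank to the range 0 to 1. The sketch below reproduces the EPL_VARTREE column in base R. The helper name percent_rank_base is hypothetical (it is not part of dplyr), and the sketch assumes the input has no missing values and at least two elements.

    ```r
    # Hypothetical base-R stand-in for dplyr::percent_rank().
    # Assumes x has no NA values and length(x) > 1.
    percent_rank_base <- function(x) {
      # Ties share the smallest rank; ranks 1..n are rescaled to 0..1
      (rank(x, ties.method = "min") - 1) / (length(x) - 1)
    }

    # The EP_VARTREE values shown in the output above
    x <- c(-0.56047565, -0.23017749, 1.55870831, 0.07050839, 0.12928774)
    percent_rank_base(x)
    # [1] 0.00 0.25 1.00 0.50 0.75
    ```

    The smallest value maps to 0, the largest to 1, and the rest are spaced by 1 / (n - 1).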
    

    Or with lapply and base R indexing (percent_rank still comes from dplyr):

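    # Names of the source columns, i.e. those beginning with "EP_"
    cols <- grep('^EP_', names(backtowide), value = TRUE)
    # Assign all EPL_* columns at once; the anchored pattern only rewrites the prefix
    backtowide[sub("^EP_", "EPL_", cols)] <- lapply(backtowide[cols], percent_rank)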
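
    If you prefer an explicit loop closer to the one sketched in the question, something like the following should also work. It is only a sketch on the sample data from this answer; the loop variable names col and new_col are illustrative. The key point is that `$` needs a literal column name, while `[[ ]]` accepts a name stored in a variable.

    ```r
    library(dplyr)  # provides percent_rank()

    # Same sample data as used elsewhere in this answer
    set.seed(123)
    backtowide <- data.frame(EP_VARTREE = rnorm(5), EP_VARSKY = rnorm(5))

    # Source columns: every name beginning with "EP_"
    cols <- grep('^EP_', names(backtowide), value = TRUE)

    for (col in cols) {
      # Build the target name, e.g. "EP_VARTREE" -> "EPL_VARTREE"
      new_col <- sub('^EP_', 'EPL_', col)
      # [[ ]] looks up and assigns columns by a name held in a variable
      backtowide[[new_col]] <- percent_rank(backtowide[[col]])
    }
    ```

    The result should match the across and lapply versions; the loop is just more verbose.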
    

    Sample data used in the examples:

    set.seed(123)
    backtowide <- data.frame(EP_VARTREE = rnorm(5), EP_VARSKY = rnorm(5))