
Remove column name pattern in multiple dataframes in R


I have more than 100 data frames loaded into R. Some, but not all, of their columns have a name prefix that I would like to remove. In the example below with 3 data frames, I would like to remove the pattern x__ from the column names while keeping the data frame names and everything else the same. How can this be done?

df1 <- data.frame(`x__a` = rep(3, 5), `x__b` = seq(1, 5, 1), `x__c` = letters[1:5])
df2 <- data.frame(`d` = rep(5, 5), `x__e` = seq(2, 6, 1), `f` = letters[6:10])
df3 <- data.frame(`x__g` = rep(5, 5), `x__h` = seq(2, 6, 1), `i` = letters[6:10])

Solution

  • You could put the data frames in a list and rename their columns with an anonymous function that calls gsub (the \(x) shorthand requires R >= 4.1).

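    # Collect every data frame named df<number> into a named list;
    # \\d+ (rather than \\d) also matches df10, df100, ...
    lst <- mget(ls(pattern='^df\\d+$'))
    
    # Strip 'x__' from the column names of each data frame
    lapply(lst, \(x) setNames(x, gsub('x__', '', names(x))))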
    # $df1
    #   a b c
    # 1 3 1 a
    # 2 3 2 b
    # 3 3 3 c
    # 4 3 4 d
    # 5 3 5 e
    # 
    # $df2
    #   d e f
    # 1 5 2 f
    # 2 5 3 g
    # 3 5 4 h
    # 4 5 5 i
    # 5 5 6 j
    # 
    # $df3
    #   g h i
    # 1 5 2 f
    # 2 5 3 g
    # 3 5 4 h
    # 4 5 5 i
    # 5 5 6 j
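
Note that gsub('x__', ...) removes the pattern wherever it occurs in a name, not only at the start. If x__ is strictly a prefix, anchoring the pattern with ^ is a little safer. Below is a minimal sketch of that variant; the helper name strip_prefix and the extra column max__val are made up for illustration and are not part of the question's data.

```r
# Hypothetical helper: remove 'x__' only when it starts a column name
strip_prefix <- function(df) {
  names(df) <- sub("^x__", "", names(df))
  df
}

# Illustrative data: 'max__val' contains 'x__' in the middle of the name
demo <- list(
  df1 = data.frame(x__a = rep(3, 5), x__b = seq(1, 5, 1)),
  df2 = data.frame(x__e = seq(2, 6, 1), max__val = 1:5)
)

anchored   <- lapply(demo, strip_prefix)
unanchored <- lapply(demo, function(x) setNames(x, gsub("x__", "", names(x))))

names(anchored$df2)    # "e" "max__val"
names(unanchored$df2)  # "e" "maval"
```

With the question's example data both versions give the same result, so the anchored form only matters if some column names contain x__ somewhere other than the start.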
    

    If you don't need the list, you can copy the modified data frames back into .GlobalEnv with list2env. I don't recommend it, though, since it overwrites the original objects of the same names.

    lapply(lst, \(x) setNames(x, gsub('x__', '', names(x)))) |> list2env(.GlobalEnv)
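
If overwriting the originals is a concern, one possible middle ground is to unpack the renamed data frames into a separate environment instead of .GlobalEnv. This is only a sketch under that assumption; the names cleaned and clean_env are illustrative.

```r
# Example data from the question (two of the three data frames)
df1 <- data.frame(x__a = rep(3, 5), x__b = seq(1, 5, 1), x__c = letters[1:5])
df2 <- data.frame(d = rep(5, 5), x__e = seq(2, 6, 1), f = letters[6:10])
lst <- list(df1 = df1, df2 = df2)

# Rename the columns, then unpack the list into a fresh environment
cleaned   <- lapply(lst, function(x) setNames(x, sub("^x__", "", names(x))))
clean_env <- list2env(cleaned, envir = new.env())

names(clean_env$df1)  # "a" "b" "c"
names(df1)            # originals are untouched: "x__a" "x__b" "x__c"
```

The renamed copies are then reachable as clean_env$df1, clean_env$df2, and so on, while the objects in the global environment keep their original column names.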