r dataframe lapply names

lapply 2 functions in one command


I want to change some names in the data frame df

> names(df)[17:26]
[1] "X1."  "X2."  "X3."  "X4."  "X5."  "X6."  "X7."  "X8."  "X9."  "X10."

I want "X" -> "Reach" and remove dots. I used lapply:

change <- function(d){
    gsub("X","reach",d) 
    gsub("\\.","",d)
}
a <- as.character(lapply(names(df)[17:26], change))

But "X" didn't change. Why?

> a 
[1] "X1"  "X2"  "X3"  "X4"  "X5"  "X6"  "X7"  "X8"  "X9"  "X10"

Solution

  • You can do this in a single gsub using back references (the parenthesised parts of the pattern expression).

    x <- names(df)[17:26]
    # "\\." matches a literal dot; an unescaped "." would match any character
    gsub("X([0-9]+)\\.", "Reach\\1", x)
    # [1] "Reach1"  "Reach2"  "Reach3"  "Reach4"  "Reach5"  "Reach6"  "Reach7"  "Reach8"  "Reach9"  "Reach10"
    

    We match the digits in your names vector using [0-9]+, and by surrounding them in parentheses we make a capture group. Whatever a group matched can be reused in the replacement through its back reference. Since this is the first set of parentheses, its back reference is \\1 (the backslash is doubled because it sits inside an R string). If we had a second set of parentheses, we could refer to it as \\2. So we match the X, then one or more digits, then the trailing dot. We replace all of that with Reach followed by the digits that matched within the parentheses, referred to as \\1.
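
    To make the \\2 case concrete, here is a small sketch with two capture groups. The vector y and its "X1.a" style names are made up for illustration and are not from your data frame:

    ```r
    # Hypothetical names with two parts worth capturing (not from the original df)
    y <- c("X1.a", "X2.b", "X10.c")

    # \\1 is the run of digits, \\2 is the trailing lowercase letter;
    # the dot between them is escaped so it only matches a literal "."
    res <- gsub("X([0-9]+)\\.([a-z])", "Reach\\1_\\2", y)
    res
    # [1] "Reach1_a"  "Reach2_b"  "Reach10_c"
    ```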

    I hope this explanation makes sense! It isn't the clearest.
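
As for why the X was left alone in the original attempt: gsub returns a new vector and leaves d untouched, and an R function returns the value of its last expression. So the result of the first gsub was thrown away and only the dot removal came back. A minimal sketch of a corrected change(), assuming a stand-in vector x in place of names(df)[17:26] (df itself isn't shown), and using fixed = TRUE, which is my choice rather than anything from the question:

```r
# Stand-in for names(df)[17:26]; the real df is not shown in the question
x <- paste0("X", 1:10, ".")

change <- function(d) {
  # Keep the result of the first substitution instead of discarding it
  d <- gsub("X", "Reach", d, fixed = TRUE)
  # The last expression is what the function returns
  gsub(".", "", d, fixed = TRUE)
}

# gsub is vectorised over its input, so lapply is not needed here
a <- change(x)
a
# [1] "Reach1"  "Reach2"  "Reach3"  "Reach4"  "Reach5"  "Reach6"  "Reach7"  "Reach8"  "Reach9"  "Reach10"
```

To write the new names back, something like names(df)[17:26] <- change(names(df)[17:26]) should do it.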