I want to change some of the column names in the data frame df:
> names(df)[17:26]
[1] "X1." "X2." "X3." "X4." "X5." "X6." "X7." "X8." "X9." "X10."
I want to replace "X" with "Reach" and remove the dots. I used lapply:
change <- function(d){
gsub("X","reach",d)
gsub("\\.","",d)
}
a <- as.character(lapply(names(df)[17:26], change))
But "X" didn't change. Why?
> a
[1] "X1" "X2" "X3" "X4" "X5" "X6" "X7" "X8" "X9" "X10"
You can do this in a single gsub call by using a capture group (a parenthesised part of the pattern) and a back reference to it in the replacement:
x <- names(df)[17:26]
# \\. matches a literal dot; an unescaped . would match any character
gsub( "X([0-9]+)\\." , "Reach\\1" , x )
# [1] "Reach1" "Reach2" "Reach3" "Reach4" "Reach5" "Reach6" "Reach7" "Reach8" "Reach9" "Reach10"
We match the digits in your names vector using [0-9]+, and by surrounding them in parentheses we create a capture group. Whatever matched inside the parentheses can be referred to in the replacement by a back reference. Since this is the first set of parentheses, its back reference is \\1. If we had a second set of parentheses, we could refer to it as \\2. So we match the X, then one or more digits, and then the trailing dot. We replace the whole match with Reach followed by the digits that were captured, which we insert using \\1.
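To make the \\2 case concrete, here is a small sketch with two sets of parentheses. The vector x2 and its values are made up for illustration and are not from your data frame; the point is only that each group gets its own number, counted from the left.

```r
# Hypothetical names, not from the question: a letter, some digits, then a dot.
x2 <- c("X1.", "Y2.", "Z10.")

# Group 1 captures the letter, group 2 captures the digits.
# The replacement writes them back in swapped order and drops the dot.
swapped <- gsub("([A-Z])([0-9]+)\\.", "\\2\\1", x2)
print(swapped)
# [1] "1X"  "2Y"  "10Z"
```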
I hope this explanation makes sense! It isn't the clearest.
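As for why the "X" did not change in your version: gsub returns a new character vector rather than modifying d, and an R function returns the value of its last evaluated expression. The result of the first gsub in change was therefore thrown away, and only the dot removal came back. A minimal sketch of a corrected two-step version follows; the vector old is a stand-in for names(df)[17:26], and fixed = TRUE is my choice to treat both patterns as plain strings.

```r
change <- function(d) {
  # Keep the result of the first substitution by assigning it back to d.
  d <- gsub("X", "Reach", d, fixed = TRUE)
  # The last expression is what the function returns.
  gsub(".", "", d, fixed = TRUE)
}

# gsub is vectorised, so lapply is not needed.
old <- c("X1.", "X2.", "X10.")  # stand-in for names(df)[17:26]
new <- change(old)
print(new)
# [1] "Reach1"  "Reach2"  "Reach10"
```

To rename the columns themselves, assign the result back, for example names(df)[17:26] <- change(names(df)[17:26]).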