
How can I delete the last two characters from a series of strings (column names)?


In my data frame many column names end with ".y" as in the example:

dat <- data.frame(x1 = sample(0:1), id = sample(10), av1.y = sample(10), av2.y = sample(10), av3.y = sample(10), av4.y = sample(10))
dat

I would like to get rid of the last two characters of all the column names that end with .y and leave the others unchanged in order to have a data frame like this:

colnames(dat) <- c("x1","id","av1","av2","av3","av4")
dat

How can I achieve this without re-typing all the column names? I found a way to do it for a single string, but I don't know how to apply it to a whole vector of strings:

library(stringi)
stri_sub("av3.y",1,3)

Solution

  • One possibility is gsub with a fixed (literal) pattern, which removes ".y" wherever it occurs in a name:

    gsub(pattern = ".y", replacement = "", x = names(dat), fixed = TRUE)
    # [1] "x1"  "id"  "av1" "av2" "av3" "av4"
    

    To match ".y" only at the end of the string, use an anchored regular expression instead:

    gsub(pattern = "\\.y$", replacement = "", x = names(dat))
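
Note that gsub only returns the cleaned names; the data frame is renamed once the result is assigned back. The following is a minimal base R sketch of that step, using the example data from the question. It also shows a variant that literally drops the last two characters of matching names, and why anchoring the pattern can matter. The name "my.year.y" and the helper variables nm and hit are made up for illustration.

```r
# Same example data as in the question (the x1 values are recycled to 10 rows).
dat <- data.frame(x1 = sample(0:1), id = sample(10), av1.y = sample(10),
                  av2.y = sample(10), av3.y = sample(10), av4.y = sample(10))

# gsub()/sub() return a new character vector, so assign it back to rename.
# sub() suffices: the anchored pattern can match at most once per name.
names(dat) <- sub("\\.y$", "", names(dat))
names(dat)
# [1] "x1"  "id"  "av1" "av2" "av3" "av4"

# Alternative in the spirit of "drop the last two characters":
# only touch names that really end in ".y" (endsWith() needs R >= 3.3.0).
nm  <- c("x1", "id", "av1.y")
hit <- endsWith(nm, ".y")
nm[hit] <- substr(nm[hit], 1, nchar(nm[hit]) - 2)
nm
# [1] "x1"  "id"  "av1"

# Why anchoring matters: "my.year.y" is a made-up name for illustration.
gsub(".y", "", "my.year.y", fixed = TRUE)  # "myear": both ".y" occurrences removed
sub("\\.y$", "", "my.year.y")              # "my.year": only the trailing ".y" removed
```

For the column names in the question both approaches give the same result; the anchored version is simply the safer default when ".y" might also appear in the middle of a name.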