
Can't rename columns with dplyr


I'm trying to rename columns in a data frame, from Characteristics..genotype. to genotype and from Characteristics..age. to age:

pData(raw_data) %>%
  rename(
    age = Characteristics..age.,
    genotype = Characteristics..genotype.
  )

I get the following error:

Error in rename(., age = Characteristics..age., genotype = Characteristics..genotype.) : object 'Characteristics..age.' not found

This doesn't make sense, since the columns exist in the data frame:

pData(raw_data)$Characteristics..genotype.

Output of the above:

[1] N171-HD82Q N171-HD82Q N171-HD82Q wt wt wt N171-HD82Q N171-HD82Q N171-HD82Q wt wt
[12] wt N171-HD82Q N171-HD82Q N171-HD82Q wt wt wt
Levels: N171-HD82Q wt

What am I missing?


Solution

  • One option is to wrap the column names in backquotes:

    library(dplyr)
    pData(raw_data) %>%
      rename(
        age = `Characteristics..age.`,
        genotype = `Characteristics..genotype.`
      )
    

    Or, based on the error (reproduced with plyr::rename), it is better to use :: to specify the package the function comes from, which avoids masking:

    pData(raw_data) %>%
      dplyr::rename(
        age = Characteristics..age.,
        genotype = Characteristics..genotype.
      )
    

    However, when tested on dplyr 0.8.3, it works fine without backquotes as well:

    data(mtcars)
    raw_data <- head(mtcars)
    names(raw_data)[1] <- "Characteristics..genotype."
    raw_data %>%
      dplyr::rename(genotype = Characteristics..genotype.)
    #             genotype cyl disp  hp drat    wt  qsec vs am gear carb
    # ...
    

    The likely issue is that plyr also includes a function named rename, so if plyr was loaded as well (attached after dplyr), its rename could mask dplyr::rename:

    raw_data %>%
      plyr::rename(genotype = Characteristics..genotype.)
    

    Error in plyr::rename(., genotype = Characteristics..genotype.) :
    unused argument (genotype = Characteristics..genotype.)
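
To check whether masking is what is happening in a given session, it can help to ask R which rename a bare call resolves to. The following is a minimal sketch, assuming dplyr is installed; rename_sources and rename_owner are names introduced here for illustration only.

```r
library(dplyr)

# Every attached package or environment that defines an object called
# `rename`, in search-path order. The first entry is the one that a
# bare `rename(...)` call picks up.
rename_sources <- find("rename")
print(rename_sources)

# Name of the environment in which the currently visible `rename` was defined.
rename_owner <- environmentName(environment(rename))
print(rename_owner)
```

If rename_owner is not "dplyr", another attached package is masking the function. In that case, either call dplyr::rename explicitly, as above, or attach dplyr after the package that masks it, since the most recently attached package takes precedence.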