r, dplyr, data-management

Leverage variable values/types to rename columns


I have searched Stack Overflow for an answer to this question without success. My request:

For a simple dataframe:

# create dummy data.frame
d <- data.frame(var1 = as.character(1:5), 
                var2 = factor(letters[1:5]), 
                var3 = 1:5, stringsAsFactors = FALSE,
                var4 = rep("<div class='The mink cat'< /p>'",5))

How do I go about renaming the variables based on the variable type / string content without referencing the original column name (e.g., d[,1] or var1)?

Rename var1 through var3 to character, factor and numeric, according to the type of the values in each variable.

Rename var4 to 'mink' by searching the variable's values and using part of the string 'The mink cat'.


Solution

  • As noted in the comments, one solution is to find the class of each column with sapply and use that vector ('nm1') as the column names. Then unlist the first row, find the index of the column whose value contains 'mink', and rename that column to 'mink'. Note that class(1:5) is "integer", so var3 ends up named integer rather than numeric.

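    # class of each column, in column order
    nm1 <- sapply(d, class)
    # use the class names as the new column names
    names(d) <- nm1
    # flatten the first row and rename the column whose value contains "mink"
    names(d)[grep("mink", unlist(d[1,]))] <- "mink"
    d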
    #  character factor integer                            mink
    #1         1      a       1 <div class='The mink cat'< /p>'
    #2         2      b       2 <div class='The mink cat'< /p>'
    #3         3      c       3 <div class='The mink cat'< /p>'
    #4         4      d       4 <div class='The mink cat'< /p>'
    #5         5      e       5 <div class='The mink cat'< /p>'
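The code above checks only the first row, and it assumes every column has exactly one class. As a rough sketch of a more defensive variant, the function below takes the first class of each column and searches every row for the pattern. The helper name `rename_by_type` and its arguments `pattern` and `new_name` are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): name columns after their
# class, except columns containing `pattern`, which are named `new_name`.
rename_by_type <- function(df, pattern, new_name) {
  # Keep only the first class so multi-class columns (e.g. ordered factors)
  # still produce a single name; vapply guarantees a character vector.
  cls <- vapply(df, function(col) class(col)[1], character(1))

  # Flag columns where any value contains the pattern, not just row 1.
  hit <- vapply(
    df,
    function(col) any(grepl(pattern, as.character(col), fixed = TRUE)),
    logical(1)
  )

  cls[hit] <- new_name
  # make.unique() adds suffixes if two columns would otherwise share a name.
  names(df) <- make.unique(unname(cls))
  df
}

# Same dummy data as in the question.
d <- data.frame(var1 = as.character(1:5),
                var2 = factor(letters[1:5]),
                var3 = 1:5,
                var4 = rep("<div class='The mink cat'< /p>'", 5),
                stringsAsFactors = FALSE)

d2 <- rename_by_type(d, "mink", "mink")
names(d2)
# expected: "character" "factor" "integer" "mink"
```

With the dummy data this gives the same names as the original answer. The difference only shows up when the matching text is missing from the first row or when two columns share a class.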