
Variable as a column name in data frame


Is there any way to use a string stored in a variable as a column name in a new data frame? The code below shows the actual and the expected output:

col.name <- 'col1'
df <- data.frame(col.name=1:4)
print(df)

# Actual output
  col.name
1        1
2        2
3        3
4        4

# Expected output
  col1
1    1
2    2
3    3
4    4

I'm aware that I can create the data frame and then use names() to rename the column, or use df[, col.name] on an existing object, but I'd like to know whether there is another solution that works while creating the data frame.


Solution

  • You cannot use a variable as the name of an argument like that: data.frame(col.name = 1:4) takes col.name literally as the column name.

    Instead what you can do is:

    df <- data.frame(placeholder_name = 1:4)
    names(df)[names(df) == "placeholder_name"] <- col.name
    

    or use the default name, which for data.frame(1:4) is "X1.4":

    df <- data.frame(1:4)
    names(df)[names(df) == "X1.4"] <- col.name
    

    or assign by position:

    df <- data.frame(1:4)
    names(df)[1] <- col.name
    

    or if you only have one column just replace the entire names attribute:

    df <- data.frame(1:4)
    names(df) <- col.name
    

    There's also the set_names function in the magrittr package that you can use to do this last solution in one step:

    library(magrittr)
    df <- set_names(data.frame(1:4), col.name)
    

    But set_names is just an alias for:

    df <- `names<-`(data.frame(1:4), col.name)
    

    which is part of base R. Figuring out why this expression works and makes sense will be a good exercise.
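
As a rough sketch of how the pieces above fit together: `names(df) <- value` is shorthand for calling the replacement function `names<-` and assigning the result back to df. The setNames() function from the stats package (attached by default in a standard R session) wraps the same idea and returns the renamed object, which gets close to naming the column while the data frame is created. The variable names df1, df2 and df3 are only illustrative.

```r
col.name <- "col1"

# Option 1: setNames() sets the names and returns the object in one step.
df1 <- setNames(data.frame(1:4), col.name)

# Option 2: name a list first, then build the data frame from it.
df2 <- data.frame(setNames(list(1:4), col.name))

# Option 3: what `names(df3) <- col.name` amounts to when written out.
df3 <- data.frame(1:4)
df3 <- `names<-`(df3, col.name)

print(names(df1))  # [1] "col1"
```

Option 2 is probably the closest to what the question asks for, since the column never carries a placeholder name.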