
Change Rows Into Column Names Using R


I'm trying to build a column name from rows that contain a date. Take the following dataset, for instance:

# create data frame
df <- data.frame(student=c('A', 'B', 'C', 'D', 'E'),
                scores=c('May, 30', 2022, 31, 39, 35))
# glimpse data
df   

  student  scores
1       A May, 30
2       B    2022
3       C      31
4       D      39
5       E      35

I want to take rows 1 and 2 of the scores column, combine them into a column name in month_year format, and then remove those two rows entirely. I tried the following script to set the column names, but I'm getting bizarre results:

colnames(df) <- df[2,]
df <- df[-2,]

Desired Output

  student  may_2022
1       C      31
2       D      39
3       E      35

What would be the ideal way of getting the desired output? Any suggestions would be appreciated. Thanks!


Solution

  • If this is truly how your data are imported, one generalizable approach is to extract the month from the first row with sub() and paste it together with the year from the second row. The pattern removes the first run of non-letter characters, so "May, 30" becomes "May".

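    # Month name from row 1 (non-letters stripped), joined to the year from row 2
    names(df)[2] <- paste0(sub("[^[:alpha:]]+", "", df$scores[1]), "_", df$scores[2])
    # Drop the two rows that held the header pieces
    df <- df[-c(1:2), ]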
    

    Output:

    #   student May_2022
    # 3       C       31
    # 4       D       39
    # 5       E       35