Search code examples
rdataframeheadernames

Can you set variables as a column name?


If I have a dataframe as follows, with a combination of text values and NA cells:

id Col1 Col2 Col3 Col4 Col5 Col...
id1 NA NA sample NA weight etc
id2 NA size NA NA NA etc
id3 volume size sample NA NA etc
id4 NA NA NA qty NA etc
id5 NA NA sample qty weight etc

Is it possible to rename the header with the most frequently used value in that column, as follows?

id volume size sample qty weight
id1 NA NA sample NA weight
id2 NA size NA NA NA
id3 volume size sample NA NA
id4 NA NA NA qty NA
id5 NA NA sample qty weight

Solution

  • Try the code below

    > cbind(df[1], setNames(df[-1], sapply(df[-1], function(x) unique(na.omit(x)))))
       id volume size sample  qty weight
    1 id1   <NA> <NA> sample <NA> weight
    2 id2   <NA> size   <NA> <NA>   <NA>
    3 id3 volume size sample <NA>   <NA>
    4 id4   <NA> <NA>   <NA>  qty   <NA>
    5 id5   <NA> <NA> sample  qty weight
    

    Data

    > dput(df)
    structure(list(id = c("id1", "id2", "id3", "id4", "id5"), Col1 = c(NA,
    NA, "volume", NA, NA), Col2 = c(NA, "size", "size", NA, NA),
        Col3 = c("sample", NA, "sample", NA, "sample"), Col4 = c(NA,
        NA, NA, "qty", "qty"), Col5 = c("weight", NA, NA, NA, "weight"
        )), class = "data.frame", row.names = c(NA, -5L))