Search code examples
r

For each row return the column name of the largest value


I have a roster of employees, and I need to know at what department they are in most often. It is trivial to tabulate employee ID against department name, but it is trickier to return the department name, rather than the number of roster counts, from the frequency table. A simple example below (column names = departments, row names = employee ids).

DF <- matrix(sample(1:9,9),ncol=3,nrow=3)
DF <- as.data.frame.matrix(DF)
> DF
  V1 V2 V3
1  2  7  9
2  8  3  6
3  1  5  4

Now how do I get

> DF2
  RE
1 V3
2 V1
3 V2

Solution

  • One option using your data (for future reference, use set.seed() to make examples using sample reproducible):

    DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(9,6,4))
    
    colnames(DF)[apply(DF,1,which.max)]
    [1] "V3" "V1" "V2"
    

    A faster solution than using apply might be max.col:

    colnames(DF)[max.col(DF,ties.method="first")]
    #[1] "V3" "V1" "V2"
    

    ...where ties.method can be any of "random" "first" or "last"

    This of course causes issues if you happen to have two columns which are equal to the maximum. I'm not sure what you want to do in that instance as you will have more than one result for some rows. E.g.:

    DF <- data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(7,6,4))
    apply(DF,1,function(x) which(x==max(x)))
    
    [[1]]
    V2 V3 
     2  3 
    
    [[2]]
    V1 
     1 
    
    [[3]]
    V2 
     2