Search code examples
rdataframemax

Column index of maximum value by row (like a pmax index)


I have a data frame with the following data:

z = data.frame(date = strptime(c(20110101,20110102,20110103,20110104,20110105,20110106),
                               format = '%Y%m%d'),
               rate1=c(1,2,3,4,5,6),
               rate2=c(2,1,3,6,8,4),
               rate3=c(4,1,3,6,8,3),
               rate4=c(7,8,9,2,1,8))

Use pmax to get the maximum value of the 'rate columns' for each row:

z$max = pmax(rate1,rate2,rate3,rate4)

#         date rate1 rate2 rate3 rate4 max
# 1 2011-01-01     1     2     4     7   7
# 2 2011-01-02     2     1     1     8   8
# 3 2011-01-03     3     3     3     9   9
# 4 2011-01-04     4     6     6     2   6
# 5 2011-01-05     5     8     8     1   8
# 6 2011-01-06     6     4     3     8   8

The pmax function allows me to get the maximum value for each row, but I was wondering how I can get the index of the maximum value for that record.

Where z$max equal the maximum values c(7, 8, 9, 6, 8, 8), I would like to get the corresponding column indices c(5, 5, 5, 3, 3, 5)

Is this possible? I know this seems like something simple but I cannot find the answer anywhere.


Solution

  • Very simple in base R:

    z$wmax <- apply(z[, -c(1,6)],1, which.max)
    

    Actually that gives you 1 less than what you were asking for since I excluded the first column but that can easily be remedied by adding one.

    z$max_col_n <- apply(z[, -c(1,6)],1, which.max) +1