
Get the top 3 values per row and keep the column index


Consider the following matrix:

 m <- structure(c(0.00165720273925865, 0.000414300684814661, 0.00126727268296249, 
0.0105768527770331, 0.00126727268296249, 0.00155972022518461, 
0.00046304194185168, 0.001291643311481, 0.0107961884336997, 0.00158409085370312, 
0.00114541954036995, 0.000438671313333171, 0.00141349645407355, 
0.0107230765481442, 0.00182779713888821, 0.00131601393999951, 
0.00046304194185168, 0.00124290205444398, 0.00984573392147784, 
0.00175468525333268, 0.00158409085370312, 0.000536153827407209, 
0.00173031462481417, 0.0100894402066629, 0.00119416079740697), .Dim = c(5L, 
5L), .Dimnames = list(c("M001_0.6", "M002_0.6", "M004_0.6", "M012_0.6", 
"M013_0.6"), NULL))

It looks like this:

                 [,1]         [,2]         [,3]         [,4]         [,5]
M001_0.6 0.0016572027 0.0015597202 0.0011454195 0.0013160139 0.0015840909
M002_0.6 0.0004143007 0.0004630419 0.0004386713 0.0004630419 0.0005361538
M004_0.6 0.0012672727 0.0012916433 0.0014134965 0.0012429021 0.0017303146
M012_0.6 0.0105768528 0.0107961884 0.0107230765 0.0098457339 0.0100894402
M013_0.6 0.0012672727 0.0015840909 0.0018277971 0.0017546853 0.0011941608

I would like to find the 3 largest values in each row and keep the column index at which each of them occurs. The code below sorts each row in decreasing order, so the top 3 values can be read off, but the column indices are discarded.

apply(m, 1, function(x) sort(x, decreasing = TRUE))

Solution

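  • You can use order() to get the column indices sorted by decreasing value, and head(x, 3) to keep the top 3. Because apply() over rows returns one column per row of m, head() keeps the first 3 rows of that result.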

    head(apply(m, 1, function(x) order(x, decreasing = TRUE)), 3)
         M001_0.6 M002_0.6 M004_0.6 M012_0.6 M013_0.6
    [1,]        1        5        5        2        3
    [2,]        5        2        3        3        4
    [3,]        2        4        2        1        2
    

    You can then combine the indices with the sorted values, transposing both so that each row of the result corresponds to a row of m:

    list(index = t(head(apply(m, 1, function(x) order(x, decreasing = TRUE)), 3)),
         values = t(head(apply(m, 1, function(x) sort(x, decreasing = TRUE)), 3)))
    
    $index
             [,1] [,2] [,3]
    M001_0.6    1    5    2
    M002_0.6    5    2    4
    M004_0.6    5    3    2
    M012_0.6    2    3    1
    M013_0.6    3    4    2
    
    $values
                     [,1]         [,2]         [,3]
    M001_0.6 0.0016572027 0.0015840909 0.0015597202
    M002_0.6 0.0005361538 0.0004630419 0.0004630419
    M004_0.6 0.0017303146 0.0014134965 0.0012916433
    M012_0.6 0.0107961884 0.0107230765 0.0105768528
    M013_0.6 0.0018277971 0.0017546853 0.0015840909
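
The answer above sorts every row twice, once with order() and once with sort(). As a minimal sketch of an alternative, the version below calls order() once per row and then looks the values up by (row, column) position. The helper name top_k_per_row, its argument k, and the toy matrix are assumptions made for illustration; they are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): returns the k largest
# values of each row together with the column index of each value.
top_k_per_row <- function(mat, k = 3) {
  k <- min(k, ncol(mat))
  # order() each row once and keep the first k column indices.
  idx <- apply(mat, 1, function(x) order(x, decreasing = TRUE)[seq_len(k)])
  # apply() returns one column per input row (or a plain vector when k is 1),
  # so force a k-row matrix and transpose it to one row per input row.
  idx <- t(matrix(idx, nrow = k))
  rownames(idx) <- rownames(mat)
  # Index the matrix with (row, column) pairs instead of sorting again.
  vals <- matrix(mat[cbind(c(row(idx)), c(idx))],
                 nrow = nrow(idx), dimnames = dimnames(idx))
  list(index = idx, values = vals)
}

# Toy data, made up for this example.
toy <- matrix(c(3, 9, 1, 7,
                4, 4, 8, 2),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("a", "b"), NULL))

res <- top_k_per_row(toy, k = 3)
res$index   # row "a": 2 4 1, row "b": 3 1 2
res$values  # row "a": 9 7 3, row "b": 8 4 4
```

Tied values, such as the two 4s in row "b", come out in their original column order, which matches the behaviour seen for M002_0.6 in the output above.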