Calculate the percentage of rows in which each column holds the row maximum


I have a data frame in R as follows:

    set.seed(123)
    A <- as.data.frame(matrix(rnorm(20 * 5, mean = 0, sd = 1), 20, 5))

which results in:

> A
            V1          V2          V3          V4           V5
1  -0.56047565 -1.06782371 -0.69470698  0.37963948  0.005764186
2  -0.23017749 -0.21797491 -0.20791728 -0.50232345  0.385280401
3   1.55870831 -1.02600445 -1.26539635 -0.33320738 -0.370660032
4   0.07050839 -0.72889123  2.16895597 -1.01857538  0.644376549
5   0.12928774 -0.62503927  1.20796200 -1.07179123 -0.220486562
6   1.71506499 -1.68669331 -1.12310858  0.30352864  0.331781964
7   0.46091621  0.83778704 -0.40288484  0.44820978  1.096839013
8  -1.26506123  0.15337312 -0.46665535  0.05300423  0.435181491
9  -0.68685285 -1.13813694  0.77996512  0.92226747 -0.325931586
10 -0.44566197  1.25381492 -0.08336907  2.05008469  1.148807618
11  1.22408180  0.42646422  0.25331851 -0.49103117  0.993503856
12  0.35981383 -0.29507148 -0.02854676 -2.30916888  0.548396960
13  0.40077145  0.89512566 -0.04287046  1.00573852  0.238731735
14  0.11068272  0.87813349  1.36860228 -0.70920076 -0.627906076
15 -0.55584113  0.82158108 -0.22577099 -0.68800862  1.360652449
16  1.78691314  0.68864025  1.51647060  1.02557137 -0.600259587
17  0.49785048  0.55391765 -1.54875280 -0.28477301  2.187332993
18 -1.96661716 -0.06191171  0.58461375 -1.22071771  1.532610626
19  0.70135590 -0.30596266  0.12385424  0.18130348 -0.235700359
20 -0.47279141 -0.38047100  0.21594157 -0.13889136 -1.026420900

For each row, I want to find which column holds the highest value, and then report the percentage of rows in which each column held that maximum. For example:

V1  V2  V3  V4  V5
2%  25% 40% 30% 3%

How can I calculate this in R?


Solution

  • max.col and table:

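    # column index of the largest value in each row
    max.col(A)
    #  [1] 4 5 1 3 3 1 5 5 4 4 1 5 4 3 5 1 5 5 1 3
    # number of rows won by each column index
    table(max.col(A))
    # 1 3 4 5 
    # 5 4 4 7 
    # same counts, labelled by column name and divided by the number of rows
    table(names(A)[max.col(A)])/nrow(A)
    #   V1   V3   V4   V5 
    # 0.25 0.20 0.20 0.35 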
    

    This doesn't match your expected output, but I suspect that's because you were only demonstrating what the result should look like ...
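
    Note that the table above drops V2, because V2 is never the row maximum in this data. If you want every column in the result, with zeros where needed, one possible variation is sketched below. The names winner, counts, pct and pct_label are made up for illustration, and ties.method = "first" is my own choice to make the result reproducible (the default, "random", picks among tied columns at random).

```r
set.seed(123)
A <- as.data.frame(matrix(rnorm(20 * 5, mean = 0, sd = 1), 20, 5))

# Column index of the largest value in each row.
# ties.method = "first" is an assumption made here for reproducibility.
winner <- max.col(A, ties.method = "first")

# Using a factor with every column name as a level keeps columns
# that never hold the maximum (they get a count of 0).
counts <- table(factor(names(A)[winner], levels = names(A)))

# Share of rows per column, expressed as a percentage.
pct <- round(100 * prop.table(counts), 1)

# Optional: character labels with a percent sign, as in the question.
pct_label <- setNames(paste0(pct, "%"), names(pct))
pct_label
# V1 = 25%, V2 = 0%, V3 = 20%, V4 = 20%, V5 = 35%
```

    prop.table(counts) gives the same result as counts / nrow(A) here, since every row contributes exactly one winner.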