
How to calculate percentile ranks for each value in R


I need to calculate percentile ranks for all the values in four columns in a dataset. The result should be something like this:

Name   Value1   Percentile1   Value2  Percentile2  Value3  Percentile3  Value4  Percentile4
 a       X        0.000000               ....            ....            ....
 b       X        0.159272               ....            ....            ....
 c       X        1.000000               ....            ....            ....
 d       X        0.240728               ....            ....            ....
...

Each percentile should be reported to six decimal places. Could anyone please help with this? I tried ntile(), but it does not give me six-decimal values.


Solution

  • Let's first generate some data:

    library(tidyverse)
    set.seed(1)
    df <- tibble(
      name   = letters,
      value1 = rnorm(length(letters)),
      value2 = -rnorm(length(letters)),
      value3 = abs(rnorm(length(letters)))
    )
    

  • Then define a function for calculating percentile ranks (source: https://stats.stackexchange.com/a/11928) and apply it to each value column:

    perc.rank <- function(x) trunc(rank(x))/length(x)
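
A quick check on a small vector may help show what this function returns. The vector `x` below is made up for illustration and is not part of the original data. Note that `rank(x) / length(x)` runs from 1/n to 1, so the smallest value never gets 0. `dplyr::percent_rank()`, brought in here only for comparison, rescales to (rank - 1) / (n - 1) and does reach 0, which looks closer to the 0.000000 and 1.000000 entries in the question's layout.

```r
library(dplyr)

perc.rank <- function(x) trunc(rank(x)) / length(x)

# Hypothetical toy vector with one tie (the two 20s)
x <- c(10, 20, 20, 30, 40)

rank(x)          # average ranks: 1.0 2.5 2.5 4.0 5.0
perc.rank(x)     # trunc() drops the .5: 0.2 0.4 0.4 0.8 1.0
percent_rank(x)  # (min_rank - 1) / (n - 1): 0.00 0.25 0.25 0.75 1.00
```

Which of the two definitions is appropriate depends on whether the lowest value should map to 0 or to 1/n.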
    
    
    df <- df %>% mutate(
      percentile1 = perc.rank(value1),
      percentile2 = perc.rank(value2),
      percentile3 = perc.rank(value3)
    )
    
    
    > df
    
       name  value1  value2 value3 percentile1 percentile2 percentile3
       <chr>  <dbl>   <dbl>  <dbl>       <dbl>       <dbl>       <dbl>
     1 a     -0.626  0.156  0.341        0.192      0.615        0.308
     2 b      0.184  1.47   1.13         0.462      1            0.731
     3 c     -0.836  0.478  1.43         0.115      0.808        0.808
     4 d      1.60  -0.418  1.98         1          0.308        0.923
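
The tibble printout above shows only about three significant digits, although the stored values keep full precision. As a sketch of how the six-decimal requirement from the question might be met, the example below computes the ranks for every column whose name starts with "value" and then formats them with `sprintf()`. It swaps in `dplyr::percent_rank()` for `perc.rank()` so that the minimum maps to 0. The column naming pattern `percentile_{.col}` and the object names `result` and `result_fmt` are assumptions chosen for illustration.

```r
library(dplyr)
library(tibble)

set.seed(1)
df <- tibble(
  name   = letters,
  value1 = rnorm(length(letters)),
  value2 = -rnorm(length(letters))
)

# One percentile column per value column, kept numeric for further work
result <- df %>%
  mutate(across(
    starts_with("value"),
    percent_rank,
    .names = "percentile_{.col}"
  ))

# Fixed six-decimal strings, intended for display or export only
result_fmt <- result %>%
  mutate(across(starts_with("percentile"), ~ sprintf("%.6f", .x)))
```

Because `sprintf()` returns character vectors, it is probably safer to keep `result` for calculations and use `result_fmt` only when writing the table out.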