I need to calculate percentile ranks for all the values in four columns in a dataset. The result should be something like this:
Name Value1 Percentile1 Value2 Percentile2 Value3 Percentile3 Value4 Percentile4
a X 0.000000 .... .... ....
b X 0.159272 .... .... ....
c X 1.000000 .... .... ....
d X 0.240728 .... .... ....
...
Each percentile should be shown to six decimal places. Could anyone please help with this? I tried ntile(), but it can't give me values with six decimal places.
Let's first generate some data:
library(tidyverse)
set.seed(1)
df <- tibble(
  name = letters,
  value1 = rnorm(length(letters)),
  value2 = -rnorm(length(letters)),
  value3 = abs(rnorm(length(letters)))
)
Function for calculating percentile ranks (source: https://stats.stackexchange.com/a/11928)
perc.rank <- function(x) trunc(rank(x))/length(x)
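To make it concrete what perc.rank() returns, here is a small worked example on a hand-made vector. Note that the smallest value gets 1/n rather than 0. The sample output in the question runs from 0 to 1, so the sketch also includes a rescaled variant; perc.rank01 is a hypothetical helper name that is not part of the linked answer, and it assumes you want the minimum mapped to 0 and the maximum to 1 (the same definition dplyr::percent_rank() uses).

```r
# Function from the answer: rank divided by the number of values.
perc.rank <- function(x) trunc(rank(x)) / length(x)

x <- c(10, 30, 20, 40)

# rank(x) is 1 3 2 4, so the result is 0.25 0.75 0.50 1.00
p <- perc.rank(x)
print(p)

# Hypothetical variant (an assumption, not from the linked answer):
# rescale so the smallest value gets 0 and the largest gets 1.
perc.rank01 <- function(x) {
  (rank(x, ties.method = "min") - 1) / (length(x) - 1)
}

# (rank - 1) / (n - 1) gives 0, 2/3, 1/3, 1 for this vector
p01 <- perc.rank01(x)
print(p01)
```

Which of the two you want depends on whether a percentile of exactly 0 should be possible.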
df <- df %>% mutate(
  percentile1 = perc.rank(value1),
  percentile2 = perc.rank(value2),
  percentile3 = perc.rank(value3)
)
> df
  name  value1 value2 value3 percentile1 percentile2 percentile3
  <chr>  <dbl>  <dbl>  <dbl>       <dbl>       <dbl>       <dbl>
1 a     -0.626  0.156  0.341       0.192       0.615       0.308
2 b      0.184  1.47   1.13        0.462       1           0.731
3 c     -0.836  0.478  1.43        0.115       0.808       0.808
4 d      1.60  -0.418  1.98        1           0.308       0.923
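The tibble printout above shows only three significant digits, although the stored values keep full precision. A minimal sketch of one way to rank every value column in one step and display six decimals, assuming the columns to rank all start with "value". The small data frame, the name df_small, and the percentile_{.col} naming scheme are illustrative choices, not part of the original answer.

```r
library(dplyr)

perc.rank <- function(x) trunc(rank(x)) / length(x)

# Small hand-made data set (illustrative only).
df_small <- tibble(
  name   = c("a", "b", "c", "d"),
  value1 = c(10, 30, 20, 40),
  value2 = c(4, 3, 2, 1)
)

# Apply perc.rank to every column whose name starts with "value";
# the new columns are called percentile_value1, percentile_value2, ...
ranked <- df_small %>%
  mutate(across(starts_with("value"), perc.rank,
                .names = "percentile_{.col}"))

# For display only: convert the percentiles to strings with six decimals.
shown <- ranked %>%
  mutate(across(starts_with("percentile"), ~ sprintf("%.6f", .x)))

print(shown)
```

sprintf() turns the columns into character strings, so keep the numeric version (ranked here) for any further calculation. If you only want the tibble printout to show more digits, options(pillar.sigfig = 6) raises the number of significant digits shown, which is not quite the same as a fixed number of decimals.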