r, ranking

How to replace values with their rank (ordered value) in R, giving tied values the same number?


My data has an ID column, and I want to replace each ID with its rank. In other words, I want to recode every 35 as 1, every 39 as 2, every 66 as 3, every 77 as 4, and every 90 as 5 in the data below.
I tried the rank function, but I could not handle the tied values. I want tied IDs to get the same number (e.g. every 35 gets 1).
How can I replace each ID with its position in ascending order?

ID
--
35
35
35
35
39
39
39
66
66
66
66
77
77
90
90
90

Solution

  • You can take advantage of the fact that a factor's levels are the sorted unique values of your data, so converting the factor to numeric yields sequential codes starting from 1:

    ID <- c(35, 35, 35, 35, 39, 39, 39, 66, 66, 66, 66, 77, 77, 90, 90, 90)
    as.numeric(as.factor(ID))
    # [1] 1 1 1 1 2 2 2 3 3 3 3 4 4 5 5 5
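
    Since the question mentions an ID column, here is a minimal sketch of the same idea applied to a data frame. The data frame df and the column name rank are assumptions made for illustration; the match() line is an equivalent base R formulation, not part of the original answer.

    ```r
    # Hypothetical data frame standing in for the asker's data
    df <- data.frame(ID = c(35, 35, 35, 35, 39, 39, 39, 66, 66, 66, 66, 77, 77, 90, 90, 90))

    # Dense rank via factor codes: the levels are the sorted unique IDs
    df$rank <- as.integer(factor(df$ID))

    # Equivalent formulation: position of each ID among the sorted unique IDs
    rank_match <- match(df$ID, sort(unique(df$ID)))

    # Both approaches should agree
    identical(df$rank, rank_match)

    # One row per distinct ID with its assigned rank
    unique(df[, c("ID", "rank")])
    ```

    Note that this relies on ID being numeric. If the IDs were stored as character strings, the factor levels would sort as text (so "100" would come before "35"), and converting with as.numeric() first would probably be needed.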
    

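    This also turns out to be much faster than the other proposed approaches (even after factoring the unique(vect) call out of Vincent's sapply function). In the benchmark below, funPascal and funVincent wrap the approaches from those other answers: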

    library(microbenchmark)
    ID <- rnorm(10000)
    microbenchmark(as.numeric(as.factor(ID)), funPascal(ID), funVincent(ID))
    # Unit: milliseconds
    #                       expr        min         lq     median        uq         max neval
    #  as.numeric(as.factor(ID))   23.94388   24.64445   25.17679   25.8263    34.68806   100
    #              funPascal(ID) 2754.19694 2822.37356 2875.71998 2929.9071  3471.90363   100
    #             funVincent(ID)  416.58985  438.13800  445.29766  458.8043   769.44278   100
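
    funPascal and funVincent are not defined here, so the benchmark above cannot be rerun as written. As a rough, self-contained sketch, the factor approach can instead be timed against a match()-based alternative using only base R. The helper names rank_factor and rank_match, the seed, and the sampled test data with ties are all assumptions for illustration, and timings will vary by machine and R version.

    ```r
    set.seed(1)                                   # arbitrary seed, for reproducibility only
    ID <- sample(1:1000, 10000, replace = TRUE)   # hypothetical IDs with many ties

    # Approach from the answer above: factor codes
    rank_factor <- function(x) as.numeric(as.factor(x))

    # Alternative (an assumption, not from the answer): look up each value
    # among the sorted unique values
    rank_match <- function(x) match(x, sort(unique(x)))

    # Sanity check: both give the same dense ranks
    stopifnot(all(rank_factor(ID) == rank_match(ID)))

    # Rough timing over 100 repeated calls each
    system.time(for (i in 1:100) rank_factor(ID))
    system.time(for (i in 1:100) rank_match(ID))
    ```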