My data has an ID column, and I want to replace each ID with its rank. In other words, I want to recode every 35 as 1, every 39 as 2, every 66 as 3, every 77 as 4, and every 90 as 5 in the data below.
I tried the rank function, but I could not get it to handle tied values the way I need. I want tied values in ID to share the same number (e.g. every 35 gets 1).
How can I replace each ID with its position in ascending order?
ID
--
35
35
35
35
39
39
39
66
66
66
66
77
77
90
90
90
You can take advantage of the fact that a factor stores its values as integer codes starting from 1, assigned in the sorted order of the unique values:
ID <- c(35, 35, 35, 35, 39, 39, 39, 66, 66, 66, 66, 77, 77, 90, 90, 90)
as.numeric(as.factor(ID))
# [1] 1 1 1 1 2 2 2 3 3 3 3 4 4 5 5 5
This also turns out to be much faster than the other proposed approaches (even after factoring the unique(vect) out of Vincent's sapply function):
library(microbenchmark)
# funPascal() and funVincent() wrap the other proposed approaches (definitions not shown here)
ID <- rnorm(10000)
microbenchmark(as.numeric(as.factor(ID)), funPascal(ID), funVincent(ID))
# Unit: milliseconds
#                       expr        min         lq     median        uq        max neval
#  as.numeric(as.factor(ID))   23.94388   24.64445   25.17679   25.8263   34.68806   100
#              funPascal(ID) 2754.19694 2822.37356 2875.71998 2929.9071 3471.90363   100
#             funVincent(ID)  416.58985  438.13800  445.29766  458.8043  769.44278   100
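If the IDs live in a data frame rather than a bare vector, the same factor trick can be applied to the column directly. The sketch below is only an illustration: the data frame `df` and the column names `rank` and `rank_match` are made up for the example, and the `match()` line is a base R alternative that was not part of the benchmark above. It also shuffles the rows first to show that the input does not need to be sorted.

```r
# Hypothetical data frame built from the IDs in the question,
# with the rows shuffled to show that input order does not matter
set.seed(1)
ids <- c(35, 35, 35, 35, 39, 39, 39, 66, 66, 66, 66, 77, 77, 90, 90, 90)
df <- data.frame(ID = sample(ids))

# Factor levels are the sorted unique values, so the integer codes
# are ranks in which tied IDs share the same number
df$rank <- as.integer(factor(df$ID))

# Base R alternative (not benchmarked above): position of each ID
# among the sorted unique values
df$rank_match <- match(df$ID, sort(unique(df$ID)))

# Both approaches should agree row by row
stopifnot(identical(df$rank, df$rank_match))
```

Using `as.integer()` instead of `as.numeric()` keeps the result as an integer vector, which is all a rank needs; either conversion returns the same codes.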