
Renaming large IDs


Suppose I have a data.frame with N rows. The id column has 10 unique values, all of them large integers (up to 1e7 in the example below). I would like to rename them to be numbered 1 through 10 and save these new IDs as a column in my data.frame.

Additionally, I would like to easily determine 1) id given id.new and 2) id.new given id.

For example:

> set.seed(123)
> ids <- sample(1:1e7,10)
> A <- data.frame(id=sample(ids,100,replace=TRUE),
                  x=rnorm(100))
> head(A)
       id          x
1 4566144  1.5164706
2 9404670 -1.5487528
3 5281052  0.5846137
4  455565  0.1238542
5 7883051  0.2159416
6 5514346  0.3796395

Solution

  • Use match() against the unique IDs, which numbers each ID by the order of its first appearance in A:

    A$id.new <- match(A$id,unique(A$id))
    

    To see the mapping between new and original IDs as a table (row 1 is id.new, row 2 is id):

    rbind(unique(A$id.new),unique(A$id))
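
The two lookups asked for in the question (id given id.new, and id.new given id) can be sketched as below. This is a minimal sketch that assumes the mapping is kept as a plain vector; the names key, new_to_old and old_to_new are hypothetical and not part of the original answer.

```r
# Rebuild the example data from the question.
set.seed(123)
ids <- sample(1:1e7, 10)
A <- data.frame(id = sample(ids, 100, replace = TRUE),
                x = rnorm(100))

# Lookup vector (hypothetical name): position i holds the original id
# that was renamed to i, in order of first appearance in A.
key <- unique(A$id)
A$id.new <- match(A$id, key)

# 1) id given id.new: index into the key.
new_to_old <- function(id.new) key[id.new]

# 2) id.new given id: match against the key (NA for ids not in A).
old_to_new <- function(id) match(id, key)
```

For example, new_to_old(3) returns the original ID that was renamed to 3, and old_to_new(A$id[1]) returns 1, because the first ID to appear in A is always numbered 1. Both functions are vectorised, so they accept whole columns as well as single values.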