I want to create a random ID variable based on an existing ID, so that observations with the same id always get the same random ID. Here is an example:
id var1 var2
1 a 1
5 g 35
1 hf 658
2 f 576
9 d 54546
2 dg 76
3 g 5
3 g 5
5 gg 56
6 g 456
8 g 6
9 e 778795
The expected result is:
id var1 var2 random_id
1 a 1 9
5 g 35 1
1 hf 658 9
2 f 576 8
9 d 54546 3
2 dg 76 8
3 g 5 7
3 g 5 7
5 gg 56 1
6 g 456 5
8 g 6 4
9 e 778795 3
To create a new id by group, use match() with sample() in base R, or cur_group_id() in dplyr. In both cases the new ids run from 1 up to the total number of groups.
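# Base R: an id's position among the shuffled unique ids becomes its random id
dat$random_id <- match(dat$id, sample(unique(dat$id)))

# dplyr: group by id with randomly ordered factor levels, then number the groups
library(dplyr)
dat %>%
  group_by(id = factor(id, levels = sample(unique(id)))) %>%
  mutate(random_id = cur_group_id())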
Output (the exact random_id values change from run to run, because the ids are shuffled randomly):
id var1 var2 random_id
1 1 a 1 6
2 5 g 35 2
3 1 hf 658 6
4 2 f 576 4
5 9 d 54546 5
6 2 dg 76 4
7 3 g 5 7
8 3 g 5 7
9 5 gg 56 2
10 6 g 456 1
11 8 g 6 3
12 9 e 778795 5
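To try the base R approach end to end, here is a minimal sketch. It assumes the data frame is called dat as above, reads the "8v" in the question as 8 (as the output above does), and uses an arbitrary seed that is not part of the original answer, only so the shuffle can be reproduced. The name shuffled is just an illustrative helper variable.

```r
# Example data from the question (the id written "8v" is taken to be 8)
dat <- data.frame(
  id   = c(1, 5, 1, 2, 9, 2, 3, 3, 5, 6, 8, 9),
  var1 = c("a", "g", "hf", "f", "d", "dg", "g", "g", "gg", "g", "g", "e"),
  var2 = c(1, 35, 658, 576, 54546, 76, 5, 5, 56, 456, 6, 778795)
)

set.seed(42)  # arbitrary seed (an assumption), only for reproducibility

# Shuffle the distinct ids; each id's position in the shuffled
# vector is used as its random id
shuffled <- sample(unique(dat$id))
dat$random_id <- match(dat$id, shuffled)

# Sanity checks: every original id maps to exactly one random id,
# no two ids share a random id, and the values are 1..number of groups
stopifnot(
  all(tapply(dat$random_id, dat$id, function(x) length(unique(x))) == 1),
  length(unique(dat$random_id)) == length(unique(dat$id)),
  setequal(dat$random_id, seq_along(shuffled))
)
```

Because match() returns the position of the first match, rows that share an id necessarily receive the same random_id, which is the property asked for in the question.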