I have a column, Col1
in my dataset with random numbers .
ID Date Col1 Score
13 2002-08-01 18221 60.75
13 2010-08-12 18448 65.33
11 2009-11-06 -65145 61.13
11 2009-11-06 -65145 59.91
12 2011-05-10 93910 62.10
14 2009-05-29 13000 70.28
15 2008-12-03 19423 39.72
My goal is to convert these seemingly random numbers into an ordered sequence (SequenceNum)
like this.
ID Date Col1 Score SequenceNum
13 2002-08-01 18221 60.75 1
13 2010-08-12 18448 65.33 2
11 2009-11-06 -65145 61.13 2
11 2009-11-06 -65145 59.91 1
12 2011-05-10 93910 62.10 1
14 2008-12-03 19423 39.72 2
14 2009-05-29 13000 70.28 1
The rule is ,
For each ID ,
Rule 1) if the Col1 values are different, then create a sequence
based on the ascending order of values in Col1. For example ID 13.
The values in Col1 are 18221 and 18448 so the expected result is
ID Date Col1 Score SequenceNum
13 2002-08-01 18221 60.75 1
13 2010-08-12 18448 65.33 2
Rule 2) If the Col1 values are same, then use the values in column
`Score` and create a sequence based on ascending order of values in
column `Score`. For ID 11, the values in Col1 are same(-65145) so the
sequence number will be based on ascending order of values in column
`Score`
11 2009-11-06 -65145 61.13 2
11 2009-11-06 -65145 59.91 1
Thanks in advance of any help on this issue.
We can use
library(dplyr)
df1 %>%
mutate(rn = row_number()) %>%
arrange(ID, Col1, Score) %>%
mutate(SequenceNum = row_number(), .by = ID) %>%
arrange(rn) %>%
select(-rn)
-output
ID Date Col1 Score SequenceNum
1 13 2002-08-01 18221 60.75 1
2 13 2010-08-12 18448 65.33 2
3 11 2009-11-06 -65145 61.13 2
4 11 2009-11-06 -65145 59.91 1
5 12 2011-05-10 93910 62.10 1
6 14 2009-05-29 13000 70.28 1
7 14 2008-12-03 19423 39.72 2