Search code examples
rrow-number

R row_number() results in only 1s


I want to produce a table where each patient has a row number in order of highest RDI to lowest RDI.

Sample Data, RDItable:

SubjectID RDI SomeOtherCols
1004 100 x
1004 100 y
1004 100 z
1002 99.5 x
1002 99.5 y
1002 99.5 z
1001 85.1 x
1001 85.1 y
1001 85.1 z
1003 100 x
1003 100 y
1003 100 z

Here is what had worked for me in the past with minor adjustments (by including the RDI column)

attempt1<- RDItable %>% 
 distinct(SubjectID, RDI) %>% 
 arrange(desc(RDI)) %>% 
 mutate(row_id = row_number(RDI))

but this results in:

SubjectID RDI row_id
1004 100 1
1003 100 1
1002 99.5 1
1001 85.1 1

and I want each row to contain a different value:

SubjectID RDI row_id
1004 100 1
1003 100 2
1002 99.5 3
1001 85.1 4

Solution

  • It is possible that the data already had a rowwise attribute which makes the row_number() to return only 1 as there is only a single row per group. We may need to ungroup if that is the case

    library(dplyr)
    RDItable %>%
       ungroup %>%
       distinct(SubjectID, RDI) %>%
        arrange(desc(RDI)) %>% 
        mutate(row_id = row_number())
    

    -output

      SubjectID   RDI row_id
    1      1004 100.0      1
    2      1003 100.0      2
    3      1002  99.5      3
    4      1001  85.1      4
    

    The output can be replicated with rowwise attribute

    > RDItable %>%
        rowwise %>% 
        distinct(SubjectID, RDI) %>%
        arrange(desc(RDI)) %>%
        mutate(row_id = row_number())
    # A tibble: 4 × 3
    # Rowwise: 
      SubjectID   RDI row_id
          <int> <dbl>  <int>
    1      1004 100        1
    2      1003 100        1
    3      1002  99.5      1
    4      1001  85.1      1