r dplyr duplicates rename data-manipulation

Rename duplicated column values group by another column

I have a dataset in which the same id appears in different groups:

df <- read.table(text='id   group  
 1         A 
 2         A 
 2         A 
 1         B 
 1         B 
 2         B
 2         C
 2         C 
 1         C 
 2         D         
 1         D
 1         D', header=TRUE)

I want to rename the duplicated values under column id that are grouped by another column group. The expected output is:

How do I do that?

Solution

Here is a data.table approach using rleid() to generate a run-length id for each unique id and group combination. We can then just paste() that number to the existing id, where it is >1.

library(data.table)
setDT(df)

df[, id_num := rleid(group), id][
    ,
    id := fifelse(
        id_num == 1,
        as.character(id),
        paste(id, id_num, sep = "_")
    )
][, `:=`(id_num = NULL)]

print(df)

#         id  group
#     <char> <char>
#  1:      1      A
#  2:      2      A
#  3:      2      A
#  4:    1_2      B
#  5:    1_2      B
#  6:    2_2      B
#  7:    2_3      C
#  8:    2_3      C
#  9:    1_3      C
# 10:    2_4      D
# 11:    1_4      D
# 12:    1_4      D