
R: How to add a range of defined prefixes to the strings in a variable?


Here is an example dataset:

id <- c("aaa1", "aaa1", "bbb1", "aaa2", "b1","a3", "a1", "b1", "a1", "b1" )

data <- data.frame(id)
data
#      id
# 1  aaa1
# 2  aaa1
# 3  bbb1
# 4  aaa2
# 5    b1
# 6    a3
# 7    a1
# 8    b1
# 9    a1
# 10   b1

I can add a prefix (A1, A2, etc.) to the "id" variable that increments each time the value changes:

library(dplyr)

data_df <- data |> mutate(id = paste0("A", cumsum(id != lag(id, default = first(id))) + 1,".", id))

#         id
# 1  A1.aaa1
# 2  A1.aaa1
# 3  A2.bbb1
# 4  A3.aaa2
# 5    A4.b1
# 6    A5.a3
# 7    A6.a1
# 8    A7.b1
# 9    A8.a1
# 10   A9.b1

However, I would like to know whether it is possible to add a range of prefixes instead, for example A1:A5, then B1:B5, and so on.

Here is the expected output:

#         id
# 1  A1.aaa1
# 2  A2.aaa1
# 3  A3.bbb1
# 4  A4.aaa2
# 5    A5.b1
# 6    B1.a3
# 7    B2.a1
# 8    B3.b1
# 9    B4.a1
# 10   B5.b1

Tidyverse and base R solutions are preferred. Thank you.


Solution

  • You can repeat the strings "A" and "B" five times each with rep(), then paste them with 1:5, which R recycles to cover all 10 rows.

    library(dplyr)
    
    data |> mutate(id = paste0(rep(c("A", "B"), each = 5), 1:5, ".", id))
    
            id
    1  A1.aaa1
    2  A2.aaa1
    3  A3.bbb1
    4  A4.aaa2
    5    A5.b1
    6    B1.a3
    7    B2.a1
    8    B3.b1
    9    B4.a1
    10   B5.b1
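
Since base R solutions were also requested, here is a minimal sketch of the same idea without dplyr. It avoids hard-coding c("A", "B") by deriving the letter from the row position, so it should also cover data with more than 10 rows (up to 26 blocks, since it relies on the built-in LETTERS). The names group_size, idx and prefix are illustrative and not part of the original answer.

```r
# Sample data from the question
id <- c("aaa1", "aaa1", "bbb1", "aaa2", "b1", "a3", "a1", "b1", "a1", "b1")
data <- data.frame(id)

group_size <- 5                    # assumed block length (A1:A5, B1:B5, ...)
idx <- seq_len(nrow(data)) - 1     # zero-based row position

# Integer division selects the letter; the remainder selects the number
# within the block
prefix <- paste0(LETTERS[idx %/% group_size + 1], idx %% group_size + 1)

data$id <- paste0(prefix, ".", data$id)
data
```

With a different block length, only group_size needs to change. A final block that is shorter than group_size is handled as well, because the prefix is computed per row rather than by recycling.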