
Recode Identifier for Different Combinations


I have a table in the following style:

Group Combi Value
----  ----  ----
x     A     1
x     A     2
x     B     1
x     B     3
x     C     2
x     C     3
y     D     1
y     D     2
y     E     1
y     E     3
y     F     2
y     F     3

I want to add another variable which renames the values in the "Combi" column in the following way: If I have the above table and "Combi" is e.g. A as in the first two rows, I want to change A to x_1_2 since both rows refer to "Group" x and the corresponding "Values" are "1" and "2" (note that a "Combi" is always assigned to exactly one "Group"). Thus, the table should look like this:

Group Combi Value Combi2
----  ----  ----  ----
x     A     1     x_1_2
x     A     2     x_1_2
x     B     1     x_1_3
x     B     3     x_1_3
x     C     2     x_2_3
x     C     3     x_2_3
y     D     1     y_1_2
y     D     2     y_1_2
y     E     1     y_1_3
y     E     3     y_1_3
y     F     2     y_2_3
y     F     3     y_2_3

Note that I always want to sort the "Values" in ascending order, so I take e.g. y_2_3 and not y_3_2. Also note that I might have more than two entries per "Group" per "Combi". I would appreciate any help on how to do this in R!

Best regards!


Solution

  • The following works using dplyr:

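    library(dplyr)
    # df is the data frame defined under "Sample data" below
    df %>%
        group_by(Group, Combi) %>%
        # sort so that the values are pasted in ascending order
        arrange(Group, Combi, Value) %>%
        # within each Group/Combi: join the values with "_" and prefix the Group
        mutate(Combi2 = paste(Group, paste0(Value, collapse = "_"), sep = "_"))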
    
    # A tibble: 14 x 4
    # Groups:   Group, Combi [6]
    #   Group Combi Value Combi2
    #   <fct> <fct> <int> <chr>
    # 1 x     A         1 x_1_2_3_4
    # 2 x     A         2 x_1_2_3_4
    # 3 x     A         3 x_1_2_3_4
    # 4 x     A         4 x_1_2_3_4
    # 5 x     B         1 x_1_3
    # 6 x     B         3 x_1_3
    # 7 x     C         2 x_2_3
    # 8 x     C         3 x_2_3
    # 9 y     D         1 y_1_2
    #10 y     D         2 y_1_2
    #11 y     E         1 y_1_3
    #12 y     E         3 y_1_3
    #13 y     F         2 y_2_3
    #14 y     F         3 y_2_3
    

    Sample data

    df <- read.table(text =
        "Group Combi Value
    x     A     1
    x     A     2
    x     A     3
    x     A     4
    x     B     1
    x     B     3
    x     C     2
    x     C     3
    y     D     1
    y     D     2
    y     E     1
    y     E     3
    y     F     2
    y     F     3", header = TRUE)
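
If you would rather not depend on dplyr, the same labels can probably be built in base R. The following is only a minimal sketch, not part of the answer above: the small data frame and the helper name `keys` are made up for illustration, and the values are deliberately unsorted to show that the sorting happens inside each "Combi". It relies on each "Combi" belonging to exactly one "Group", as stated in the question.

```r
# Hypothetical example data with unsorted values
df <- data.frame(
  Group = c("x", "x", "x", "x", "y", "y"),
  Combi = c("A", "A", "B", "B", "D", "D"),
  Value = c(2, 1, 3, 1, 2, 1)
)

# One key per Combi: its values sorted ascending and joined with "_"
keys <- tapply(df$Value, df$Combi, function(v) paste(sort(v), collapse = "_"))

# Look up each row's key by its Combi and prefix the Group
df$Combi2 <- paste(df$Group, keys[as.character(df$Combi)], sep = "_")

print(df)
```

Unlike the `arrange()` step in the dplyr version, this leaves the original row order untouched, which may or may not be what you want.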