Search code examples
rdataframemelt

How to most efficiently melt a data.frame with multiple measured column groups


I have a data frame of the form:

df <- data.frame("id_1" = seq(1,100, 1),
                 "id_2" = seq(1,100, 1),
                 "id_3" = seq(1,100, 1),
                 "m1_g1" = seq(1,100, 1),
                 "m2_g1" = seq(2,200, 2),
                 "m3_g2" = seq(3,300, 3),
                 "m4_g2" = seq(4,400, 4))

I would like to melt this so that the id columns serve as id's, but then I effectively have 2 rows per original entry that show me the m1-m4 columns where m1-m2 are of the same measure type (g1) and m3-m4 are of another measure type (g2)

The final data frame would look like this:

   id_1    id_2    id_3   var   value_1    value_2
1    1       1       1     1       1           3
2    1       1       1     2       2           4
3    2       2       2     1       2           6
4    2       2       2     2       4           8

I've tried using melt to create separate melted data frames and then pasting over the column, but I feel there has got to be a better way.

Thank you all!


Solution

  • Using tidyr, we can do it in one pivot, but we need to rename some of them first so that the leading m*_ is less different.

    library(dplyr)
    library(tidyr) # pivot_longer
    rename(df, m1_g2 = m3_g2, m2_g2 = m4_g2) %>%
      pivot_longer(-starts_with("id"), names_pattern = "m(.*)_g(.*)", names_to = c("val", ".value"))
    # # A tibble: 200 x 6
    #     id_1  id_2  id_3 val     `1`   `2`
    #    <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
    #  1     1     1     1 1         1     3
    #  2     1     1     1 2         2     4
    #  3     2     2     2 1         2     6
    #  4     2     2     2 2         4     8
    #  5     3     3     3 1         3     9
    #  6     3     3     3 2         6    12
    #  7     4     4     4 1         4    12
    #  8     4     4     4 2         8    16
    #  9     5     5     5 1         5    15
    # 10     5     5     5 2        10    20
    # # ... with 190 more rows