Search code examples
rpivot-tablereshape

Reshape data with multiple columns to long format


I'm pretty sure, that a question like this was asked before but I cannot find any.

This is my dataset:

data.frame(Group = c("a", "b"), 
           MEAN_A = 1:2, 
           MEAN_B = 3:4, 
           MED_A = 5:6, 
           MED_B = 7:8) 
  Group MEAN_A MEAN_B MED_A MED_B
1     a      1      3     5     7
2     b      2      4     6     8

What I want is the following:

data.frame(Group = c("a", "a", "b", "b"), 
           Name = c("MEAN", "MED", "MEAN", "MED"),
           Value_A = c(1, 5, 2, 6),
           Value_B = c(3, 7, 4, 8))
  Group Name Value_A Value_B
1     a MEAN       1       3
2     a  MED       5       7
3     b MEAN       2       4
4     b  MED       6       8

So I want to keep the variable Group and have a new column which tells me, if the original variable was from MEAN or MED and two columns with the Values of A and B, that where initially in the variable names after MEAN or MED.

I've already tried pivot_longer, even with patterns, but I'm not able to get my desired output.


Solution

  • Here's one approach:

    Select the group and "mean" columns, rename the means to "Value_", add a "mean" identifier. Select the group and "med" columns, rename the meds to "Value_", add a "med" identifier. bind the frames together, sort by "Group":

    df %>% select(1:3) %>%
      rename_with(~gsub(pattern = "MEAN", replacement = "Value", .), .cols = starts_with("MEAN")) %>%
      mutate(Name = "MEAN") %>%
      rbind(df %>%
                  select(c(1,4,5)) %>% 
                  rename_with(~gsub(pattern = "MED", replacement = "Value", .), .cols = starts_with("MED")) %>%
                  mutate(Name = "MED")) %>%
      select(Group, Name, Value_A, Value_B) %>%
      arrange(Group)
    

    gives

      Group Name Value_A Value_B
    1     a MEAN       1       3
    2     a  MED       5       7
    3     b MEAN       2       4
    4     b  MED       6       8
    

    Edit: another tidy approach:

    df %>% 
      pivot_longer(cols = any_of(c(ends_with("_A"), ends_with("_B"))),
                   names_to = c("Name", ".value"),
                   names_sep = "_") %>%
      rename(Value_A = A, Value_B = B)
    
    # A tibble: 4 × 4
      Group Name  Value_A Value_B
      <chr> <chr>   <int>   <int>
    1 a     MEAN        1       3
    2 a     MED         5       7
    3 b     MEAN        2       4
    4 b     MED         6       8