Search code examples
rdataframepivot-table

Pivot wide to long in R


I need to pivot a matrix in R, and I'm a little stumped as to exactly the right way to get what I want. I'm sure pivot_longer is probably the answer, but I can't figure it out.

My current dataframe looks like this:

d <- data.frame(c(1,2),c(4,5),c(2,3),c(5,4))
names(d) <- (c("treat1-measure1","treat1-measure2",
               "treat2-measure1","treat2-measure2"))
d

  treat1-measure1 treat1-measure2 treat2-measure1 treat2-measure2
1               1               4               2               5
2               2               5               3               4

As you can see, we have the same two measure across two treatment for multiple independent runs of the experiment.

I'd like to pivot it to look like this:

p <- data.frame(c(1,1,2,2),c(1,2,2,3),c(4,5,5,4))
names(p) <- c("treat","m1","m2")
p

  treat m1 m2
1     1  1  4
2     1  2  5
3     2  2  5
4     2  3  4

In the real data, there are only 3 measures and two experimental groups, so a little manual work is doable. Thanks for any help


Solution

  • Here is a tidyverse approach with tidyr with a little help from tibble:

    library(tidyr)
    library(tibble)
    d %>% rownames_to_column(var = "index") %>%
      pivot_longer(cols = (-index)) %>%
      separate(col = "name", into = c("Treatment","Measure")) %>%
      pivot_wider(id_cols = c("index","Treatment"), names_from = "Measure") %>%
      select(-index)
    #  Treatment measure1 measure2
    #  <chr>        <dbl>    <dbl>
    #1 treat1           1        4
    #2 treat2           2        5
    #3 treat1           2        5
    #4 treat2           3        4
    

    Edit: If anyone ever finds this later, another approach that doesn't rely on creating an index column would be with unnest:

    d %>%
      pivot_longer(cols = everything()) %>%
      separate(col = "name", into = c("Treatment","Measure")) %>%
      pivot_wider(id_cols = c("Treatment"), names_from = "Measure") %>%
      unnest(cols = c("measure1", "measure2"))
    ## A tibble: 4 x 3
    #  Treatment measure1 measure2
    #  <chr>        <dbl>    <dbl>
    #1 treat1           1        4
    #2 treat1           2        5
    #3 treat2           2        5
    #4 treat2           3        4
    

    Edit 2: I recommend using @akrun's far superior method with the subtly documented .value sentinel from tidyr.