Search code examples
rtidyr

How to pivot in two directions at once?


I have example data like below, where for a unit, I have multiple treatments, and multiple measurements for each treatment in before and after periods. I want to do a before-after comparison, so I gather, and spread as shown below to achieve the desired output.

My question is: can this be done with one pivot_ command? I have been trying to work out if a properly constructed spec can achieve this, but have not succeeded yet. One such attempt is below.

I think I would accept either a way to make it work, or a clear explanation of how spec or pivoting in general works that explains why this is not possible. From the pivoting vignette I think I understand that:

  • .name contains unique column names from the input table when pivoting longer
  • .value contains new column names that you want to have in the output when pivoting longer

However, I do not know what the additional columns in spec mean or when they are needed. I hoped that my spec would understand that "before" values of period should go into a column named before, but apparently it doesn't work like that.

library(tidyverse) # tidyr 0.8.99.9000
tbl <- tibble(
  obsv_unit = rep(1:2, each = 4),
  treatment = rep(c("A", "B"), each = 2, times = 2),
  period = rep(c("before", "after"), times = 4),
  measure1 = 1:8,
  measure2 = 11:18
)
tbl
#> # A tibble: 8 x 5
#>   obsv_unit treatment period measure1 measure2
#>       <int> <chr>     <chr>     <int>    <int>
#> 1         1 A         before        1       11
#> 2         1 A         after         2       12
#> 3         1 B         before        3       13
#> 4         1 B         after         4       14
#> 5         2 A         before        5       15
#> 6         2 A         after         6       16
#> 7         2 B         before        7       17
#> 8         2 B         after         8       18

tbl %>%
  gather("measure", "value", starts_with("measure")) %>%
  spread(period, value)
#> # A tibble: 8 x 5
#>   obsv_unit treatment measure  after before
#>       <int> <chr>     <chr>    <int>  <int>
#> 1         1 A         measure1     2      1
#> 2         1 A         measure2    12     11
#> 3         1 B         measure1     4      3
#> 4         1 B         measure2    14     13
#> 5         2 A         measure1     6      5
#> 6         2 A         measure2    16     15
#> 7         2 B         measure1     8      7
#> 8         2 B         measure2    18     17

spec <- tibble(
  `.name` = c("measure1", "measure2", "measure1", "measure2"),
  `.value` = c("before", "before", "after", "after"),
  period = c("before", "before", "after", "after")
)

tbl %>% pivot_wider(spec = spec)
#> Expected a vector, not NULL
tbl %>% pivot_longer(spec = spec)
#> Failed to create output due to bad names.
#> Choose another strategy with `names_repair`

Solution

  • The purpose of the spec is to handle situations with irregular patterns in column names (in wide format) and values (in long format). It is only useful for overcoming difficulties where a regular expression + glue cannot specify the before and after. If you look at the examples in the pivoting vignette, it doesn't accomplish anything that couldn't be done with rename or mutate and a single pivot.

    spec is never proposed as a method to pivot in two directions at once.

    In conclusion, you need to use both pivot_longer and pivot_wider to accomplish this transformation.