I have example data like below, where for a unit, I have multiple treatments, and multiple measurements for each treatment in before and after periods. I want to do a before-after comparison, so I gather
, and spread
as shown below to achieve the desired output.
My question is: can this be done with one pivot_
command? I have been trying to work out if a properly constructed spec
can achieve this, but have not succeeded yet. One such attempt is below.
I think I would accept either a way to make it work, or a clear explanation of how spec
or pivoting in general works that explains why this is not possible. From the pivoting vignette I think I understand that:
.name
contains unique column names from the input table when pivoting longer.value
contains new column names that you want to have in the output when pivoting longerHowever, I do not know what the additional columns in spec
mean or when they are needed. I hoped that my spec
would understand that "before"
values of period
should go into a column named before
, but apparently it doesn't work like that.
library(tidyverse) # tidyr 0.8.99.9000
tbl <- tibble(
obsv_unit = rep(1:2, each = 4),
treatment = rep(c("A", "B"), each = 2, times = 2),
period = rep(c("before", "after"), times = 4),
measure1 = 1:8,
measure2 = 11:18
)
tbl
#> # A tibble: 8 x 5
#> obsv_unit treatment period measure1 measure2
#> <int> <chr> <chr> <int> <int>
#> 1 1 A before 1 11
#> 2 1 A after 2 12
#> 3 1 B before 3 13
#> 4 1 B after 4 14
#> 5 2 A before 5 15
#> 6 2 A after 6 16
#> 7 2 B before 7 17
#> 8 2 B after 8 18
tbl %>%
gather("measure", "value", starts_with("measure")) %>%
spread(period, value)
#> # A tibble: 8 x 5
#> obsv_unit treatment measure after before
#> <int> <chr> <chr> <int> <int>
#> 1 1 A measure1 2 1
#> 2 1 A measure2 12 11
#> 3 1 B measure1 4 3
#> 4 1 B measure2 14 13
#> 5 2 A measure1 6 5
#> 6 2 A measure2 16 15
#> 7 2 B measure1 8 7
#> 8 2 B measure2 18 17
spec <- tibble(
`.name` = c("measure1", "measure2", "measure1", "measure2"),
`.value` = c("before", "before", "after", "after"),
period = c("before", "before", "after", "after")
)
tbl %>% pivot_wider(spec = spec)
#> Expected a vector, not NULL
tbl %>% pivot_longer(spec = spec)
#> Failed to create output due to bad names.
#> Choose another strategy with `names_repair`
The purpose of the spec
is to handle situations with irregular patterns in column names (in wide format) and values (in long format). It is only useful for overcoming difficulties where a regular expression + glue cannot specify the before and after. If you look at the examples in the pivoting vignette, it doesn't accomplish anything that couldn't be done with rename
or mutate
and a single pivot.
spec
is never proposed as a method to pivot in two directions at once.
In conclusion, you need to use both pivot_longer
and pivot_wider
to accomplish this transformation.