I try to reshape the following
country | region | abc2001 | abc2002 | xyz2001 | xyz2002 |
---|---|---|---|---|---|
Japan | East Asia | 1 | 2 | 4.5 | 5.5 |
to the following
country | region | year | abc | xyz |
---|---|---|---|---|
Japan | East Asia | 2001 | 1 | 4.5 |
Japan | East Asia | 2002 | 2 | 5.5 |
actually there are five more variables in the same way.
I use the following code:
long <- data %>% pivot_longer(cols = c(-country, -region), names_to = c(".value", "year"), names_pattern = "([^\\.]*)\\.*(\\d{4})")
The result is long version of the data except that I lose country and region variables. What do I do wrong? Or how else can I do this better?
Thank you in advance.
We may change the regex pattern to match one or more non-digits(\\D+
) as the first capture group and one or more digits (\\d+
) as the second one
librarytidyr)
pivot_longer(data, cols = c(-country, -region),
names_to = c(".value", "year"), names_pattern = "(\\D+)(\\d+)")
-output
# A tibble: 2 × 5
country region year abc xyz
<chr> <chr> <chr> <int> <dbl>
1 Japan East Asia 2001 1 4.5
2 Japan East Asia 2002 2 5.5
data <- structure(list(country = "Japan", region = "East Asia", abc2001 = 1L,
abc2002 = 2L, xyz2001 = 4.5, xyz2002 = 5.5),
class = "data.frame", row.names = c(NA,
-1L))