Search code examples
rtidyversetidyr

I lose the constant variables (including id) when using pivot_longer with multiple variables


I try to reshape the following

country region abc2001 abc2002 xyz2001 xyz2002
Japan East Asia 1 2 4.5 5.5

to the following

country region year abc xyz
Japan East Asia 2001 1 4.5
Japan East Asia 2002 2 5.5

actually there are five more variables in the same way.

I use the following code:

long <- data %>% pivot_longer(cols = c(-country, -region), names_to = c(".value", "year"), names_pattern = "([^\\.]*)\\.*(\\d{4})")

The result is long version of the data except that I lose country and region variables. What do I do wrong? Or how else can I do this better?

Thank you in advance.


Solution

  • We may change the regex pattern to match one or more non-digits(\\D+) as the first capture group and one or more digits (\\d+) as the second one

    librarytidyr)
    pivot_longer(data, cols = c(-country, -region),
        names_to = c(".value", "year"), names_pattern = "(\\D+)(\\d+)")
    

    -output

    # A tibble: 2 × 5
      country region    year    abc   xyz
      <chr>   <chr>     <chr> <int> <dbl>
    1 Japan   East Asia 2001      1   4.5
    2 Japan   East Asia 2002      2   5.5
    

    data

    data <- structure(list(country = "Japan", region = "East Asia", abc2001 = 1L, 
        abc2002 = 2L, xyz2001 = 4.5, xyz2002 = 5.5), 
    class = "data.frame", row.names = c(NA, 
    -1L))