Search code examples
rpanelreshapemelt

How to properly format my panel data in R?


I know that similar questions exist but I have tried everything from melt to reshape to Stacked, etc. and nothing is getting me even close.

My data currently looks like this:

ID Treatment Round_1_Decision1 Round_1_Decision2 Round_2_Decision1 Round_2_Decision2
1      2            1                0                0                1
2      1            0                0                1                1

I need it to instead look like this:

ID Treatment Round Decision1 Decision2
1      2       1       1         0
1      2       2       0         1
2      1       1       0         0
2      1       2       1         1

Any suggestions?


Solution

  • We may use pivot_longer - main things to consider is the names_to and names_pattern. Here, we need 'Round' column that should generate the values as the suffix of 'Round' in column name and the other column as the value of the column (.value). In the names_pattern, capture the column names substring as a group with capture group ((...)) i.e. -\\w+ - should match the 'Round', followed by the _, then capture the digits((\\d+)), then the underscore (_) followed by the next capture group ((.*)) that includes the rest of the characters for the .value part

    library(tidyr)
    pivot_longer(df1, cols = starts_with("Round"),
         names_to = c("Round", ".value"), names_pattern = "\\w+_(\\d+)_(.*)")
    

    -output

    # A tibble: 4 × 5
         ID Treatment Round Decision1 Decision2
      <int>     <int> <chr>     <int>     <int>
    1     1         2 1             1         0
    2     1         2 2             0         1
    3     2         1 1             0         0
    4     2         1 2             1         1
    

    data

    df1 <- structure(list(ID = 1:2, Treatment = 2:1, Round_1_Decision1 = 1:0, 
        Round_1_Decision2 = c(0L, 0L), 
    Round_2_Decision1 = 0:1, Round_2_Decision2 = c(1L, 
        1L)), class = "data.frame", row.names = c(NA, -2L))