Search code examples
rtidyrreshape2

Converting from long to wide, using pivot_wide() on two columns in R


I would like to transform my data from long format to wide by the values in two columns. How can I do this using tidyverse?

Updated dput

structure(list(Country = c("Algeria", "Benin", "Ghana", "Algeria", 
                       "Benin", "Ghana", "Algeria", "Benin", "Ghana"
), Indicator = c("Indicator 1", 
             "Indicator 1", 
             "Indicator 1", 
             "Indicator 2", 
             "Indicator 2", 
             "Indicator 2",
             "Indicator 3", 
             "Indicator 3", 
             "Indicator 3"
), Status = c("Actual", "Forecast", "Target", "Actual", "Forecast", 
          "Target", "Actual", "Forecast", "Target"), Value = c(34, 15, 5, 
                                                               28, 5, 2, 43, 5, 
                                                               1)), row.names 
= c(NA, -9L), class = c("tbl_df", "tbl", "data.frame"))


    Country Indicator   Status   Value
    <chr>   <chr>       <chr>    <dbl>
1 Algeria Indicator 1 Actual      34
2 Benin   Indicator 1 Forecast    15
3 Ghana   Indicator 1 Target       5
4 Algeria Indicator 2 Actual      28
5 Benin   Indicator 2 Forecast     5
6 Ghana   Indicator 2 Target       2
7 Algeria Indicator 3 Actual      43
8 Benin   Indicator 3 Forecast     5
9 Ghana   Indicator 3 Target       1

Expected output

Country Indicator1_Actual Indicator1_Forecast Indicator1_Target Indicator2_Actual

Algeria       34                    15                 5           28

etc

Appreciate any tips!

foo <- data %>% pivot_wider(names_from = c("Indicator","Status"), values_from = "Value") 

works perfectly!


Solution

  • I think the mistake is in your pivot_wider() command data %>% pivot_wider(names_from = Indicator, values_from = c(Indicator, Status))

    I bet you can't use the same column for both names and values.

    Try this code

    data %>% pivot_wider(names_from = c(Indicator, Status), values_from = Value))
    

    Explanation: Since you want the column names to be Indicator 1_Actual, you need both columns indicator and status going into your names_from

    It would be helpful if you provided example data and expected output. But I tested this on my dummy data and it gives the expected output -

    Data:

    # A tibble: 4 x 4
         a1    a2 a3       a4
      <int> <int> <chr> <dbl>
    1     1     5 s        10
    2     2     4 s        20
    3     3     3 n        30
    4     4     2 n        40
    

    Call : a %>% pivot_wider(names_from = c(a2, a3), values_from = a4)

    Output :

    # A tibble: 4 x 5
         a1 `5_s` `4_s` `3_n` `2_n`
      <int> <dbl> <dbl> <dbl> <dbl>
    1     1    10    NA    NA    NA
    2     2    NA    20    NA    NA
    3     3    NA    NA    30    NA
    4     4    NA    NA    NA    40
    

    Data here if you want to reproduce

    structure(list(a1 = 1:4, a2 = 5:2, a3 = c("s", "s", "n", "n"), 
        a4 = c(10, 20, 30, 40)), row.names = c(NA, -4L), class = c("tbl_df", 
    "tbl", "data.frame"))
    

    Edit : For the edited question after trying out the correct pivot_wider() command - It looks like your data could actually have duplicates, in which case the output you are seeing would make sense - I would suggest you try to figure out if your data actually has duplicates by using filter(Country == .., Indicator == .., Status == ..)