Search code examples
rdata-manipulationdata-cleaning

Error when transforming wide data to long data using reshape () and pivot_long()


I am trying to transform my data from wide to long but keep getting a few error:

Current data structure

Country       Month        New_users Returning_users  Active_users  Sum_byyear Sum_bycoun
Angola        2020-06-01   5         0                5             17         47  
Angola        2020-09-01   6         2                8             17         47
Angola        2020-10-01   3         1                4             17         47
Angola        2021-01-01   9         7                16            30         47
Angola        2021-03-01   5         3                8             30         47            
Angola        2021-05-01   5         1                6             30         47
South Africa  2020-06-01   40        9                49            157        814
South Africa  2020-09-01   65        13               78            157        814
South Africa  2020-10-01   20        10               30            157        814
South Africa  2021-01-01   47        37               84            657        814
South Africa  2021-03-01   224       66               290           657        814
South Africa  2021-05-01   206       77               283           657        814
 

Desired data structure

Country    Month        Users            Count      Sum_by_year     Sum_by_country
Angola     2020-06-01   New_users        5          17              47
Angola     2020-06-01   Returning_users  0          17              47
Angola     2020-06-01   Active_users     5          17              47
Angola     2020-09-01   New_users        6          17              47
Angola     2020-09-01   Returning_users  2          17              47
Angola     2020-09-01   Active_users     8          17              47
Angola     2020-10-01   New_users        3          17              47
Angola     2020-10-01   Returning_users  1          17              47
Angola     2020-10-01   Active_users     4          17              47
Angola     2021-01-01   New_users        9          30              47
Angola     2021-01-01   Returning_users  7          30              47
Angola     2021-01-01   Active_users     16         30              47
Angola     2021-03-01   New_users        5          30              47
Angola     2021-03-01   Returning_users  3          30              47
Angola     2021-03-01   Active_users     8          30              47
Angola     2021-05-01   New_users        5          30              47
Angola     2021-05-01   Returning_users  1          30              47
Angola     2021-05-01   Active_users     6          30              47
Ghana     2020-06-01   New_users        40          157              814
Ghana     2020-06-01   Returning_users  9           157              814
Ghana     2020-06-01   Active_users     49          157              814
Ghana     2020-09-01   New_users        65          157              814
Ghana     2020-09-01   Returning_users  13          157              814
Ghana     2020-09-01   Active_users     78          157              814
Ghana     2020-10-01   New_users        20          157              814
Ghana     2020-10-01   Returning_users  10          157              814
Ghana     2020-10-01   Active_users     30          157              814
Ghana     2020-09-01   Active_users     78          157              814
Ghana     2021-01-01   New_users        47          657              814
Ghana     2021-01-01   Returning_users  37          657              814
Ghana     2021-01-01   Active_users     84          657              814
Ghana     2021-03-01   New_users        224         657              814
Ghana     2021-03-01   Returning_users  66          657              814
Ghana     2021-03-01   Active_users     283         657              814
Ghana     2021-05-01   New_users        206         657              814
Ghana     2021-05-01   Returning_users  77          657              814
Ghana     2021-05-01   Active_users     283         657              814

Essentially it should be converted to long varying by both country and date. I have converted date and every other variable is in the correct format.

What I've tried:

reshape (df, dir = "long", sep = ","
                  idvar = "Country",
                  varying= c("Month"))
Error in guess (varying):
   Failed to guess time-varying variables from their names 

Another method

df%>%pivot_longer (Country, names_to = c(".value", "User"), names_pattern = "(.*)(\\d+)")

Error in 'stop_vctrs()':
!Can't recycle '..1' (size 656) to match '...3' (size 0)

Please assist.


Solution

  • df <- read.table(text = "Country       Month        New_users Returning_users  Active_users  Sum_byyear Sum_bycoun
    Angola        2020-06-01   5         0                5             17         47  
    Angola        2020-09-01   6         2                8             17         47
    Angola        2020-10-01   3         1                4             17         47
    Angola        2021-01-01   9         7                16            30         47
    Angola        2021-03-01   5         3                8             30         47            
    Angola        2021-05-01   5         1                6             30         47
    SouthAfrica  2020-06-01   40        9                49            157        814
    SouthAfrica  2020-09-01   65        13               78            157        814
    SouthAfrica  2020-10-01   20        10               30            157        814
    SouthAfrica  2021-01-01   47        37               84            657        814
    SouthAfrica  2021-03-01   224       66               290           657        814
    SouthAfrica  2021-05-01   206       77               283           657        814", h = T)
    
    library(tidyverse)
    df %>% 
      pivot_longer(c(New_users, Returning_users, Active_users), names_to = "Users", values_to = "Count") %>% 
      select(Country, Month, Users, Count, Sum_byyear, Sum_bycoun)
    
    # A tibble: 36 x 6
       Country Month      Users           Count Sum_byyear Sum_bycoun
       <chr>   <chr>      <chr>           <int>      <int>      <int>
     1 Angola  2020-06-01 New_users           5         17         47
     2 Angola  2020-06-01 Returning_users     0         17         47
     3 Angola  2020-06-01 Active_users        5         17         47
     4 Angola  2020-09-01 New_users           6         17         47
     5 Angola  2020-09-01 Returning_users     2         17         47
     6 Angola  2020-09-01 Active_users        8         17         47
     7 Angola  2020-10-01 New_users           3         17         47
     8 Angola  2020-10-01 Returning_users     1         17         47
     9 Angola  2020-10-01 Active_users        4         17         47
    10 Angola  2021-01-01 New_users           9         30         47
    # ... with 26 more rows