Search code examples
rexpanddcast

Convert rows into columns taking the names from one variable


I have a dataset of this kind df:

df <- as.data.frame(matrix(c(1,1,1,1,2,2,
                            2020,2020,2020,2020,2020,2020, 
                            1,2,1,2,3,6, 
                            "Spain","Spain","France","France","Germany","Japan",
                            0,40,5,200,98,300, 
                            1,3,2,19,4,11), ncol = 6))
names(df) <- c('ID', 'year', 'month', 'country', 'buy', 'sell')

I want to transform df. Keeping ID, year and month as reference variables and based on country I am trying to convert the rows into new variables, in the following way df_modified:

df_modified <- as.data.frame(matrix(c(1,1,2,2,
                                      2020,2020,2020,2020, 
                                      1,2,3,6,
                                      0,40,"NA","NA",
                                      5,200,"NA","NA",
                                      1,3,"NA","NA", 
                                      2,19,"NA","NA", 
                                      "NA","NA",98,300, 
                                      "NA","NA",4,11), ncol = 9))
names(df_modified) <- c('ID', 'year', 'month', 'buy_Spain', 'sell_Spain', 'buy_France','sell_France', 'buy_Germany', 'sell_Germany')

I have tried:

library(reshape2)
df_modified <- dcast(df, id+year+month ~ country)

However it does not make the trick.

Note: Numeric variables are numeric in my original dataset (country is character, etc).

Any clue?

Regards


Solution

  • We can specify the value.var in dcast

    library(data.table)
    dcast(setDT(df),  ID + year + month ~ country, value.var = c('buy', 'sell'))
    

    Or using pivot_wider

    library(tidyr)
    library(dplyr)
     df %>%
       pivot_wider(names_from = country, values_from = c(buy, sell), 
            values_fill = list(buy = '0', sell = '0'))
    # A tibble: 4 x 11
    #  ID    year  month buy_Spain buy_France buy_Germany buy_Japan sell_Spain sell_France sell_Germany sell_Japan
    #  <chr> <chr> <chr> <chr>     <chr>      <chr>       <chr>     <chr>      <chr>       <chr>        <chr>     
    #1 1     2020  1     0         5          0           0         1          2           0            0         
    #2 1     2020  2     40        200        0           0         3          19          0            0         
    #3 2     2020  3     0         0          98          0         0          0           4            0         
    #4 2     2020  6     0         0          0           300       0          0           0            11     
    

    If we need to convert to numeric class, just do type.convert

    df %>%
        type.convert(as.is = TRUE) %>%
        pivot_wider(names_from = country, values_from = c(buy, sell), 
            values_fill = list(buy = 0, sell = 0))
    # A tibble: 4 x 11
    #     ID  year month buy_Spain buy_France buy_Germany buy_Japan sell_Spain sell_France sell_Germany sell_Japan
    #  <int> <int> <int>     <int>      <int>       <int>     <int>      <int>       <int>        <int>      <int>
    #1     1  2020     1         0          5           0         0          1           2            0          0
    #2     1  2020     2        40        200           0         0          3          19            0          0
    #3     2  2020     3         0          0          98         0          0           0            4          0
    #4     2  2020     6         0          0           0       300          0           0            0         11