Search code examples
rdata.tabledata-manipulationreshape2dcast

R Long to wide with count and sum


I have a data as below:

#dt
Method   ID     Source     Amt
A         1          X      10
A         1          Y      20
C         1          Z      30
B         2          Y      15
D         2          Z      10
C         3          X      20
D         3          X      20
E         4          Z      10
E         4          Z      10

What I want is:

ID    Total_Amt     Method_A     Method_B     Method_C     Method_D     Method_E     Source_X     Source_Y     Source_Z 
1            60            2            0            1            0            0            1            1            1
2            25            0            1            0            1            0            0            1            1
3            40            0            0            1            1            0            2            0            0
4            20            0            0            0            0            2            0            0            2

For the Method and Source columns, I want to calculate the count by their ID and use dcast to transform to wide format and also add up Amt column by ID.

Any Help?


Solution

  • Here's one way using dplyr and tidyr libraries. We first calculate sum of Amt values for each ID, get the data in long format, count number of rows and get it back in wide format.

    library(dplyr)  
    library(tidyr)
    
    df %>%
      group_by(ID) %>%
      mutate(Amt = sum(Amt)) %>%
      pivot_longer(cols = c(Method, Source)) %>%
      count(ID, value, Amt, name) %>%
      pivot_wider(names_from = c(name, value), values_from = n, values_fill = 0)
    
    
    #     ID   Amt Method_A Method_C Source_X Source_Y Source_Z Method_B Method_D Method_E
    #  <int> <int>    <int>    <int>    <int>    <int>    <int>    <int>    <int>    <int>
    #1     1    60        2        1        1        1        1        0        0        0
    #2     2    25        0        0        0        1        1        1        1        0
    #3     3    40        0        1        2        0        0        0        1        0
    #4     4    20        0        0        0        0        2        0        0        2