Search code examples
rlisttransactional

Transaction list to basket data


I have a table like

ID    Productpurchased   Year
1A          Abc          2011
1A          Abc          2011       
1A          xyz          2011
1A          Abc          2012
2A          bcd          2013
2A          Abc          2013

Output required format

ID       Purchase basket     Year     Abc-count  xyz-count  bcd-count    
1A       (Abc,xyz)           2011      2           1          0
1A       (Abc)               2012      1           0          0
2A       (bcd , Abc)         2013      1           0          1

Solution

  • We can do this easily with data.table. Convert the 'data.frame' to 'data.table' (setDT(df1)), grouped by 'ID' , 'Year', paste the unique elements of 'Productpurchased' and assign (:=) it to create 'Purchase_basket' column, then dcast from 'long' to 'wide' specifying the fun.aggregate as length

    library(data.table)
    dcast(setDT(df1)[, Purchase_basket := toString(unique(Productpurchased)),.(ID, Year)],
           ID + Year + Purchase_basket ~paste0(Productpurchased, ".count"), length)
    #    ID Year Purchase_basket Abc.count bcd.count xyz.count
    #1: 1A 2011        Abc, xyz         2         0         1
    #2: 1A 2012             Abc         1         0         0
    #3: 2A 2013        bcd, Abc         1         1         0