
R transforming data from columns to rows by variable


I am having trouble transforming my data frame. I would like to work out how often each client buys (one purchase every how many days). I thought it would be easiest to transform my transaction data, which is formatted as:

Transaction_ID  Client_ID    Date
1               1            2017-01-01
2               1            2017-01-04
3               2            2017-02-21
4               1            2017-05-01
5               3            2017-02-04
6               3            2017-03-01
...             ...          ...

to:

Client_ID    Date_1_purchase     Date_2_purchase     Date_3_purchase         ...
1            2017-01-01          2017-01-04          2017-05-01              ...
2            2017-02-21          NA                  NA                      ...
3            2017-02-04          2017-03-01          NA                      ...

Or:

Client_ID    Date_First_purchase     Date_Last_purchase     Numberof_orders
1            2017-01-01              2017-05-01              3
2            2017-02-21              2017-02-21              1   
3            2017-02-04              2017-03-01              2  

I have tried using dcast, but I couldn't achieve what I wanted. I bet there is a way to do that, or even to calculate what I want without transforming the dataset, but I did not find it.


Solution

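  • We can create a per-client sequence ID with rowid() and use it in dcast to reshape from 'long' to 'wide' format. rowid() numbers each client's rows in their existing order, so sort by Date first if the transactions are not already chronological.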

    library(data.table)
    dcast(setDT(df1), Client_ID ~ paste0("Date_", rowid(Client_ID), 
                                      "_purchase"), value.var = "Date")
    #   Client_ID Date_1_purchase Date_2_purchase Date_3_purchase
    #1:         1      2017-01-01      2017-01-04      2017-05-01 
    #2:         2      2017-02-21              NA              NA
    #3:         3      2017-02-04      2017-03-01              NA
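
  • The second layout from the question (first purchase, last purchase, number of orders) can probably be produced without reshaping at all, by aggregating per client. The following is a minimal sketch under some assumptions: the example data is retyped from the question, Date is stored as a Date (not character), and Avg_days_between is a hypothetical column name added here for the "once in how many days" figure. It is computed as (last - first) / (orders - 1), which equals the mean gap between consecutive purchases.

```r
library(data.table)

# Example data retyped from the question (the transaction ID column is
# left out because the aggregation does not need it)
df1 <- data.table(
  Client_ID = c(1, 1, 2, 1, 3, 3),
  Date = as.Date(c("2017-01-01", "2017-01-04", "2017-02-21",
                   "2017-05-01", "2017-02-04", "2017-03-01"))
)

# One row per client: first and last purchase dates and the order count
res <- df1[, .(Date_First_purchase = min(Date),
               Date_Last_purchase  = max(Date),
               Numberof_orders     = .N),
           by = Client_ID]

# Mean number of days between consecutive purchases.
# Avg_days_between is a hypothetical name, not part of the original question.
# Clients with a single order get NA because there is no gap to measure.
res[, Avg_days_between := ifelse(
  Numberof_orders > 1,
  as.numeric(difftime(Date_Last_purchase, Date_First_purchase, units = "days")) /
    (Numberof_orders - 1),
  NA_real_
)]

res
```

    With the sample data, client 1 averages 60 days between purchases, client 3 averages 25 days, and client 2 has NA because only one order exists.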