Analysis of products purchased after certain days

I have been trying to do a sequential analysis of products purchased after a certain period of time: for example, which product combinations customers purchase after 7 days, and what proportion of customers purchase each combination. I have tried the arulesSequences package, but my data is structured in a way that doesn't fit the package. My columns are user ID, date of purchase, product ID and product name. I have searched a lot but haven't found a clear way to do this.

Dayy        UID     leaf_category_name  leaf_category_id
5/1/2018    47      Cubes               38860
5/1/2018    272     Pastas & Noodles    34616
5/1/2018    1827    Flavours & Spices   34619
5/1/2018    3505    Feature Phones      1506

This is the kind of data I have. UID stands for user ID, and the leaf category is, in simple terms, the product purchased. The dataset is large: 2,049,278 rows.

Code I have tried:

library(Matrix)
library(arules)
library(arulesSequences)

library(arulesViz)

#splitting data into transactions
transactions <- as(split(data$leaf_category_id, data$UID), "transactions")

frequent_sequences <- cspade(transactions, parameter=list(support=0.5))

and

# Convert tabular data to sequences. Item is in
# column 1, sequence ID is column 2, and event ID is column 3.
seqs = make_sequences(data, item_col = 1, sid_col = 2, eid_col = 3)             

# generate frequent sequential patterns with minimum
# support of 0.1 and maximum of 6 elements
fseq = spade(seqs, 0.1, 6)

I want to look at the sequence of products purchased after a certain number of days. Can someone help me with this?

Thank You

Solution

The apriori path is quite nice. Since I don't have your data, I'll use a well-known dataset, Groceries, as an example (in your case, you can first subset your data to the rows after the date you want):

library(arules)
data(Groceries)

# frequent itemsets; inspect() shows which products have the highest support
frequentItems <- eclat(Groceries, parameter = list(supp = 0.07, maxlen = 15))
inspect(frequentItems)
     items                         support    count
[1]  {other vegetables,whole milk} 0.07483477  736 
[2]  {whole milk}                  0.25551601 2513 
[3]  {other vegetables}            0.19349263 1903 
[4]  {rolls/buns}                  0.18393493 1809 
[5]  {yogurt}                      0.13950178 1372 
[6]  {soda}                        0.17437722 1715 
[7]  {root vegetables}             0.10899847 1072 
[8]  {tropical fruit}              0.10493137 1032 
[9]  {bottled water}               0.11052364 1087 
[10] {sausage}                     0.09395018  924 
[11] {shopping bags}               0.09852567  969 
[12] {citrus fruit}                0.08276563  814 
[13] {pastry}                      0.08896797  875 
[14] {pip fruit}                   0.07564820  744 
[15] {whipped/sour cream}          0.07168277  705 
[16] {fruit/vegetable juice}       0.07229283  711 
[17] {newspapers}                  0.07981698  785 
[18] {bottled beer}                0.08052872  792 
[19] {canned beer}                 0.07768175  764

If you prefer, you can plot it:

itemFrequencyPlot(Groceries, topN=5, type="absolute")

Then you can see the association rules:

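association <- apriori(Groceries, parameter = list(supp = 0.001, conf = 0.5))
# sort by confidence so the strongest rules come first
association_conf <- sort(association, by = "confidence", decreasing = TRUE)
inspect(head(association_conf))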


    lhs                                           rhs                support     confidence lift     count
[1] {rice,sugar}                               => {whole milk}       0.001220132 1          3.913649 12   
[2] {canned fish,hygiene articles}             => {whole milk}       0.001118454 1          3.913649 11   
[3] {root vegetables,butter,rice}              => {whole milk}       0.001016777 1          3.913649 10   
[4] {root vegetables,whipped/sour cream,flour} => {whole milk}       0.001728521 1          3.913649 17   
[5] {butter,soft cheese,domestic eggs}         => {whole milk}       0.001016777 1          3.913649 10   
[6] {citrus fruit,root vegetables,soft cheese} => {other vegetables} 0.001016777 1          5.168156 10

The last column, count, shows how many times each rule appears. You can read it as "how many rows" and, if each row is a customer, as the number of customers. However, you have to decide what "how many customers" means for you: for a purchase history a,b,a,c, should the count be 4 (every purchase) or 3 (each distinct product once)? To answer that, you have to look at how your data is structured.
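To make the two ways of counting concrete, here is a minimal base-R sketch. The vectors and the `orders` data frame are made-up illustrations, not taken from your data:

```r
# Hypothetical purchase history of one customer (illustration only).
purchases <- c("a", "b", "a", "c")

n_purchases <- length(purchases)          # every purchase counts
n_distinct  <- length(unique(purchases))  # each distinct product counts once

# The same distinction for customers: rows versus distinct user IDs.
orders <- data.frame(UID  = c(1, 1, 2),
                     item = c("a", "a", "a"),
                     stringsAsFactors = FALSE)

n_rows      <- nrow(orders[orders$item == "a", ])            # rows containing "a"
n_customers <- length(unique(orders$UID[orders$item == "a"])) # users who bought "a"
```

Here `n_purchases` is 4 and `n_distinct` is 3, while `n_rows` is 3 and `n_customers` is 2. Which of these you want depends on whether a repeat purchase by the same customer should count again.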
Edit: finally, you can have a look at this. As you mentioned, there is also the cspade algorithm, which can help.
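For the original question (which products are bought at least 7 days after another product, and by what share of customers), one possible starting point is a plain base-R self-join rather than a sequence-mining package. This is only a sketch under assumptions: the toy rows below are invented, the date format is assumed to be month/day/year, and `gap_days`, `pairs` and `result` are names chosen for illustration.

```r
# Toy data in the same layout as the question (values are made up).
data <- data.frame(
  Dayy = c("5/1/2018", "5/9/2018", "5/1/2018", "5/3/2018",
           "5/1/2018", "5/10/2018"),
  UID  = c(47, 47, 272, 272, 1827, 1827),
  leaf_category_name = c("Cubes", "Pastas & Noodles",
                         "Cubes", "Pastas & Noodles",
                         "Cubes", "Pastas & Noodles"),
  stringsAsFactors = FALSE
)

# Assumption: dates are written as month/day/year.
data$Dayy <- as.Date(data$Dayy, format = "%m/%d/%Y")

# Pair every purchase with every other purchase of the same user.
pairs <- merge(data, data, by = "UID", suffixes = c("_first", "_later"))

# Keep only pairs where the later purchase is at least gap_days after the first.
gap_days <- 7
gap <- as.numeric(pairs$Dayy_later - pairs$Dayy_first)
pairs <- pairs[gap >= gap_days, ]

# Count each user at most once per (first product, later product) pair.
pairs <- unique(pairs[, c("UID",
                          "leaf_category_name_first",
                          "leaf_category_name_later")])

result <- aggregate(UID ~ leaf_category_name_first + leaf_category_name_later,
                    data = pairs, FUN = length)
names(result)[3] <- "n_customers"

# Share of all customers who show this "first, then later" pattern.
result$share <- result$n_customers / length(unique(data$UID))
result
```

In the toy data, two of the three users buy Pastas & Noodles at least 7 days after Cubes, so `result` has one row with `n_customers` equal to 2. On 2,049,278 rows the self-join can grow large, so it may be worth filtering to the categories or date range of interest first.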