
Apriori Error in R


I am using the apriori algorithm to create groupings/market baskets of my items. Below is a summary of the dataset after converting it to the transactions class.

I think my error has to do with the parameters being chosen in the apriori function. Any insights would be great.

summary(groceries)

transactions as itemMatrix in sparse format with
 57 rows (elements/itemsets/transactions) and
 817 columns (items) and a density of 0.03135133 

most frequent items:
                  A                   B                   C                   D             (Other) 
                 13                  13                  13                  12                  12                1397 

element (itemset/transaction) length distribution:
sizes
  3   4   5   6   7   8   9  10  13  14  16  17  18  22  29  30  32  33  34  40  43  45  55  77  86 111 118 353 
  7   4   4   4   3   4   4   3   1   3   2   1   1   1   1   2   1   1   1   1   1   1   1   1   1   1   1   1 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   3.00    5.00    9.00   25.61   29.00  353.00 

includes extended item information - examples:
  labels
1      E
2      F
3      G


groceryrules<-apriori(groceries, 
                      parameter = list(support = 0.15, 
                      confidence = 0.05, 
                      minlen = 2))

This runs fine, but it does not return many recommendations. When I lower the support to get more rules, I get a warning.

I tried:

groceryrules<-apriori(groceries, 
                      parameter = list(support = 0.14, 
                      confidence = 0.05, 
                      minlen = 2))

Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen maxlen target   ext
       0.05    0.1    1 none FALSE            TRUE       5    0.14      2     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 7 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[817 item(s), 57 transaction(s)] done [0.00s].
sorting and recoding items ... [14 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 5 done [0.00s].
writing ... [90 rule(s)] done [0.00s].
creating S4 object  ... done [0.00s].
Warning message:
In apriori(groceries, parameter = list(support = 0.14, confidence = 0.05,  :
  Mining stopped (maxlen reached). Only patterns up to a length of 5 returned!

Why would changing the support by such a small amount cause an error?
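
As a side note on the numbers in the log above, the following base R sketch works through what the two support thresholds mean in transaction counts. It assumes the absolute count is the relative support times the number of transactions, truncated to an integer; that assumption reproduces the `Absolute minimum support count: 7` line for support = 0.14, but the count for 0.15 is derived the same way and is not shown in the original output.

```r
n_transactions <- 57        # from summary(groceries)
supports <- c(0.15, 0.14)   # the two thresholds tried above

# Assumption: absolute count = relative support * number of transactions,
# truncated to an integer (this matches the 7 reported in the log).
min_counts <- floor(supports * n_transactions)
names(min_counts) <- supports
print(min_counts)

# Share of the dataset that a single transaction represents.
one_transaction <- 1 / n_transactions
print(round(one_transaction, 4))
```

With only 57 transactions, one transaction is about 1.75% of the data, so moving the support from 0.15 to 0.14 lowers the required count by a whole transaction (from 8 to 7 under the assumption above). That can let additional items and longer itemsets pass the threshold.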


Solution

  •   Mining stopped (maxlen reached). Only patterns up to a length of 5 returned!
    

    minlen and maxlen are to blame. You set minlen = 2 in your parameter list but did not specify maxlen, so the algorithm used the default value of 10 (see the parameter specification in the output). However, maxtime, which you also did not specify and which therefore took its default of 5 seconds, limits how long the algorithm may spend checking subsets of a given length. If checking rules of length n would take more than 5 seconds, the algorithm stops with a warning like the one you got, which says that it only reached a length of 5 before the maxtime limit was breached.

    checking subsets of size 1 2 3 4 5 done [0.00s].
    

    Checking subsets of size 6 would have taken too long, so it was skipped.

    So either change maxtime (add it to the parameter list the same way as minlen, e.g. maxtime = 10 or maxtime = 20) or, as in most cases, ignore the warning. This is not an error. Is it really important for you to find rules longer than 5 items? I think not. You did not ask for maxlen = 10; it is only a default value.
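
A minimal sketch of the suggestion above, setting maxlen and maxtime explicitly in the parameter list. The original `groceries` object is not available here, so this uses the `Groceries` dataset that ships with arules as a stand-in, and the support, maxlen, and maxtime values are illustrative assumptions, not tuned recommendations.

```r
library(arules)

# Stand-in data: the Groceries transactions bundled with arules
# (an assumption; replace with your own transactions object).
data("Groceries")

rules <- apriori(
  Groceries,
  parameter = list(
    support    = 0.01,  # illustrative value for this larger dataset
    confidence = 0.05,
    minlen     = 2,
    maxlen     = 5,     # state the longest rule you actually want
    maxtime    = 20     # seconds allowed for subset checking (default is 5)
  )
)

# Confirm the longest rule returned and look at a few strong rules.
print(max(size(rules)))
inspect(head(sort(rules, by = "lift"), 3))
```

Setting maxlen to the length you care about makes the limit a deliberate choice rather than a default, so a message about reaching it is expected rather than surprising.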