I'm running the Apriori algorithm in R using arules. I have a massive amount of data to mine and I don't want to use a sample if at all possible. I really only need to see rules associated with items that are not sold very often.
The code I'm using now is:
basket_rules <- apriori(data, parameter = list(supp = 0.7, conf = 0.2, target = "rules", minlen = 4, maxlen = 7))
I only want rules with low support, but because of the size and nature of my data I can't get the minimum support any lower than 0.7. Is it possible to return only a range of support in order to conserve memory?
For example, something like: list(sup <= .05 and >= .0001)
Any other ideas for limiting memory usage while running Apriori are really appreciated.
The nature of support (downward closure) does not allow you to efficiently generate only itemsets/rules with support in a specific range. With the R implementation in arules, you always have to create all frequent itemsets above the minimum support first and then filter. There might be implementations of FP-growth or similar algorithms that are more memory efficient for your problem.
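To illustrate the mine-then-filter approach, here is a minimal sketch. It uses the Groceries data set that ships with arules as a stand-in for your data, and the thresholds (0.005 and 0.05) are made-up values for illustration only. Note that this does not reduce memory use during mining, since everything above the minimum support is still generated first.

```r
library(arules)

# Stand-in data: Groceries ships with arules (assumption: your data is
# already a "transactions" object like this one).
data("Groceries")

# Mine with the lowest minimum support you can afford (hypothetical value).
all_rules <- apriori(
  Groceries,
  parameter = list(supp = 0.005, conf = 0.2, target = "rules",
                   minlen = 2, maxlen = 4),
  control = list(verbose = FALSE)
)

# Keep only the rules whose support falls inside the range of interest.
rare_rules <- subset(all_rules, subset = support >= 0.005 & support <= 0.05)

# Look at a few of the lowest-support rules.
inspect(head(sort(rare_rules, by = "support", decreasing = FALSE), 5))
```

Keeping maxlen small is one of the few levers that limits how many itemsets get generated in the first place.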
Another way to approach this problem is to look at the data more closely. Maybe you have several items which appear in many transactions. These items might not be interesting to you, and you can remove them before mining rules.
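A rough sketch of removing the very frequent items before mining could look like this. Again, Groceries is only a stand-in for your data, and the 10% cutoff and the mining thresholds are arbitrary assumptions you would need to tune.

```r
library(arules)

# Stand-in data (assumption: your data is a "transactions" object).
data("Groceries")

# Relative frequency (support) of every single item.
item_freq <- itemFrequency(Groceries)

# Hypothetical cutoff: drop items that occur in more than 10% of transactions.
keep_items <- which(item_freq <= 0.10)
trimmed <- Groceries[, keep_items]

# Mine the trimmed transactions; thresholds are illustrative only.
trimmed_rules <- apriori(
  trimmed,
  parameter = list(supp = 0.002, conf = 0.2, target = "rules",
                   minlen = 2, maxlen = 4),
  control = list(verbose = FALSE)
)

summary(trimmed_rules)
```

With the dominant items gone, there are far fewer frequent itemsets at any given minimum support, so you can usually push the threshold lower before running out of memory.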