I am using the R arules package to generate rules from a transaction dataset. The dataset has over 500 transactions with items such as apples, beer and so on.
I know how to generate the rules and sort them by support or confidence, but how can I look only at the rules that involve certain items? For example, I only want the rules that have apple in them.
Something like:
inspect(rules[keyword='apple'])
You can do that with subset.
inspect(subset(rules, subset = items %in% "apple"))
Since you do not provide your data, I will give a full example using data provided in the arules package.
library(arules)
data(Groceries)
rules <- apriori(Groceries, parameter = list(supp = 0.001, conf = 0.8))
Now pick out the rules that mention yogurt. There are too many to show the full result, so I just show the first three.
inspect(subset(rules, subset = items %in% "yogurt")[1:3])
    lhs                   rhs                support     confidence lift     count
[1] {yogurt,
     cereals}          => {whole milk}       0.001728521 0.8095238  3.168192 17
[2] {yogurt,
     rice}             => {other vegetables} 0.001931876 0.8260870  4.269346 19
[3] {other vegetables,
     yogurt,
     specialty cheese} => {whole milk}       0.001321810 0.8125000  3.179840 13
None of these has yogurt on the rhs, so here is rule 20 as well, to show that the filter catches yogurt there too.
inspect(subset(rules, subset = items %in% "yogurt")[20])
    lhs                                      rhs      support     confidence lift     count
[1] {other vegetables,butter milk,pastry} => {yogurt} 0.001220132 0.8        5.734694 12
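If you care which side of the rule the item is on, the same idea should work with lhs or rhs in place of items inside subset. Here is a minimal sketch on the Groceries data; the variable names (yogurt_lhs, yogurt_rhs, yogurt_and_milk) are only illustrative, and I have turned off apriori's progress messages with control = list(verbose = FALSE).

```r
library(arules)
data(Groceries)

# Same rule set as in the example above; verbose = FALSE only silences the log
rules <- apriori(Groceries,
                 parameter = list(supp = 0.001, conf = 0.8),
                 control = list(verbose = FALSE))

# Rules with yogurt among the antecedent items (left-hand side)
yogurt_lhs <- subset(rules, subset = lhs %in% "yogurt")

# Rules with yogurt as the consequent (right-hand side)
yogurt_rhs <- subset(rules, subset = rhs %in% "yogurt")

# %ain% keeps only rules that contain ALL of the listed items
yogurt_and_milk <- subset(rules, subset = items %ain% c("yogurt", "whole milk"))

# Look at the strongest right-hand-side matches first
inspect(head(sort(yogurt_rhs, by = "lift"), 3))
```

Since an item cannot appear on both sides of the same rule, the lhs and rhs subsets together should contain exactly the rules that items %in% "yogurt" returns.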