
Getting error in pruning apriori rules in grocery dataset


I'm trying to prune the rules created by the apriori algorithm for the Groceries dataset, but I end up with no rules at all.

Using R 3.4.2 and RStudio (Version 1.1.383)

Imported libraries

library(arules)
data("Groceries")

I have created the rules

rules <- apriori(Groceries, parameter = list(supp =0.001,
                                         conf = 0.5,
                                         target = "rules"))

Started pruning redundant rules

rules.sorted = sort(rules, by="lift")
subset.matrix <- is.subset(rules.sorted, rules.sorted)

While converting the lower triangle of the matrix to NA, I got a warning

subset.matrix[lower.tri(subset.matrix, diag=T)] = NA

Warning message:

In `[<-`(`*tmp*`, as.vector(i), value = NA) :
x[.] <- val: x is “ngTMatrix”, val not in {TRUE, FALSE} is coerced; NA |--> TRUE

Then tried to identify redundant rules

redundant <- colSums(subset.matrix, na.rm=T) >= 1

Finally pruned rules

rules.pruned = rules.sorted[!redundant]

But inspecting the result shows nothing

inspect(rules.pruned)

Even the summary of rules.pruned reports a set of 0 rules

summary(rules.pruned)

I guess the problem is mainly caused by the warning raised when setting the lower triangle of the matrix to NA

How to overcome the warning?


Solution

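  • Since arules version 1.5-2, is.subset() returns a sparse matrix (see the package NEWS). A sparse logical matrix cannot store NA, so the assignment coerces NA to TRUE, as the warning says. Every column, diagonal included, then sums to at least 1, and every rule is flagged as redundant. If you want to keep your code, request a dense matrix: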

    subset.matrix <- is.subset(rules.sorted, rules.sorted, sparse = FALSE)
    

    However, the dense matrix is very inefficient and only practical for very small rule sets. Use is.redundant() instead to find redundant rules.
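
The is.redundant() route could look roughly like the sketch below. It reuses the Groceries data and the apriori() parameters from the question. Note that is.redundant() uses its own notion of redundancy (by default, a rule counts as redundant when a more general rule with the same or higher confidence exists), so the surviving rules will not necessarily match what the lift-sorted subset matrix would have kept.

```r
library(arules)
data("Groceries")

# Same mining parameters as in the question
rules <- apriori(Groceries, parameter = list(supp = 0.001,
                                             conf = 0.5,
                                             target = "rules"))

# Logical vector with one entry per rule; TRUE marks a redundant rule
redundant <- is.redundant(rules)

# Keep only the non-redundant rules, then sort them by lift for inspection
rules.pruned <- rules[!redundant]
rules.pruned <- sort(rules.pruned, by = "lift")

summary(rules.pruned)
inspect(head(rules.pruned, 5))
```

Because no dense rules-by-rules matrix is built, this should remain usable on rule sets where the sparse = FALSE approach runs out of memory.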