r apriori arules

Matching association rule to source records


I use the following code and get the expected association rules:

library("arules")
data("Adult")
rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
labels(rules)

There are 50 rules. I would like to go back to the source data:

Adult3 <- as.data.frame(as(Adult, "matrix"))

I then want to add a new column, Adult3$RUL_NUM, holding for each record the number of the rule (1 to 50 in this case) that the record complies with. If a record complies with more than one rule, I would like to store the last rule it complies with.


Solution

  • You probably want to look into the is.superset function. For example,

    is.superset(Adult, lhs(rules))
    

    will give you a logical matrix that indicates for each transaction which rule is "relevant" (i.e., where all the items in the LHS are present).

    Edit: If you want to match the whole rule then use the code Avi suggested below:

    is.superset(Adult, rules)
    

    To get the ID (number) of the last matching rule, you can use fairly straightforward R code on the superset matrix:

    w <- sapply(apply(is.superset(Adult, rules), MARGIN = 1, which), tail, n = 1)
    

    This code finds the column indices of all TRUE entries in each row and then returns the last one.
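
One caveat: when some records match no rule at all, the `sapply`/`apply` combination can return a list rather than a plain vector, which is awkward to assign to a column. Below is a minimal sketch of one way around that. The helper `last_match` is hypothetical (it is not part of arules), and the toy matrix only stands in for the real transaction-by-rule matrix.

```r
# Hypothetical helper (not part of arules): for each row of a logical
# matrix, return the index of the last TRUE column, or NA if none is TRUE.
last_match <- function(m) {
  m <- as.matrix(m)  # assumption: a sparse result can be densified this way
  vapply(seq_len(nrow(m)), function(i) {
    hits <- which(m[i, ])
    if (length(hits) > 0L) max(hits) else NA_integer_
  }, integer(1))
}

# Toy stand-in for the superset matrix: 3 records (rows) by 4 rules (columns).
toy <- matrix(c(TRUE,  FALSE, TRUE,  FALSE,
                FALSE, FALSE, FALSE, FALSE,
                FALSE, TRUE,  FALSE, TRUE),
              nrow = 3, byrow = TRUE)

res <- last_match(toy)
print(res)  # 3 NA 4

# With the objects from the question this would presumably become:
# Adult3$RUL_NUM <- last_match(is.superset(Adult, rules))
```

Records that match no rule get NA in this sketch. The commented last line assumes that the rows of the superset matrix are in the same order as the rows of Adult3.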