
Using subset from arules package in rpy2


It's easy to call the apriori algorithm from the arules package through rpy2:

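from collections import OrderedDict

from rpy2.robjects.packages import importr
from rpy2.robjects.vectors import ListVector

arules = importr("arules")

# apriori parameters: minimum support, minimum confidence, mining target
od = OrderedDict()
od["supp"] = 0.0005
od["conf"] = 0.7
od["target"] = 'rules'

result = ListVector(od)

# dataset is assumed to be an R transactions object prepared beforehand
my_rules = arules.apriori(dataset, parameter=result)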

However, the arules subset function expects an R expression in its subset parameter:

rules.sub <- subset(rules, subset = rhs %in% "marital-status=Never-married" & lift > 2)

Is it possible to use this subset function with rpy2?


Solution

  • If subset is (re)defined in the R package arules, the arules object returned by importr will contain it. In your Python code it is available as arules.subset.

    The subset parameter is a slightly different story because it is an unevaluated R expression rather than an ordinary value. There are several ways to tackle this; one of them is to wrap the expression in an ad-hoc R function.

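    from rpy2.robjects import r

    def mysubset(rules, subset_str):
        # Build an R function with the subset expression spliced in,
        # then call it on the rules object
        rfunc = r("function(rules) { arules::subset(rules, subset = %s) }" %
                  subset_str)
        return rfunc(rules)

    # Single quotes in Python so the R double quotes pass through unchanged
    rules_sub = mysubset(rules,
                         'rhs %in% "marital-status=Never-married" & lift > 2')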
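
As a minimal sketch of the same wrapping idea, the string construction can be split from the R evaluation so the generated R code can be inspected first. The helper names build_subset_source and make_subset_function are hypothetical, and the second one assumes rpy2 and the R package arules are installed when it is called.

```python
def build_subset_source(subset_expr):
    """Return R source for a one-argument function that subsets rules.

    subset_expr is an R expression given as a Python string. It is pasted
    into the R code verbatim, so it should come from a trusted source.
    """
    return "function(rules) { arules::subset(rules, subset = %s) }" % subset_expr


def make_subset_function(subset_expr):
    """Evaluate the generated source in R and return the resulting R function."""
    # Imported lazily so that build_subset_source can be used without R.
    from rpy2.robjects import r
    return r(build_subset_source(subset_expr))


# Single quotes on the Python side leave the R double quotes untouched.
expr = 'rhs %in% "marital-status=Never-married" & lift > 2'
source = build_subset_source(expr)
print(source)
```

With R available, make_subset_function(expr)(rules) should then return the filtered rules, where rules is the object produced by arules.apriori.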