
arules substitution of item values in R


I'm trying to make arules work with my data, which has transaction_ID, Item_name and Item_ID columns. If I call the apriori function on item_name and transaction_ID, it is too slow, but if I call it on item_id and transaction_ID, it is really fast. Is there a way to create the rules with item_id and then substitute the IDs with their real names? Here is a code example to work with:

library(arules)
library(arulesViz)

products <- c(1,1,1,3,4,5,6,4)
transaction_id <- c(2,2,3,3,3,4,4,4)
dataset <- data.frame(products ,transaction_id)
dataset
transaction <- as(split(dataset[,"products"],dataset[,"transaction_id"]), "transactions")

rule <- apriori(transaction, parameter = list(supp = 0.001, conf = 0.8))

inspect(rule)

products_id <- c(1,3,4,5,6)
names <- c("nail","Black Hammer 127","White desk 12","green desk","pink car")

cod <- data.frame(products = products_id, names)

Solution

  • The best way is to replace the item labels in the transactions using the function itemLabels:

    itemLabels(transaction)
      [1] "1" "3" "4" "5" "6"
    itemLabels(transaction) <- c("nail","Black Hammer 127","White desk 12","green desk","pink car")
    rule <- apriori(transaction, parameter = list(supp = 0.001, conf = 0.8))
    inspect(rule)
          lhs                                 rhs                support   confidence
     [1]  {Black Hammer 127}               => {nail}             0.3333333 1 
     ...
    
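Assigning the names by hand only works if they are typed in the same order as the current item labels. A minimal sketch of a safer variant, assuming the lookup table `cod` from the question, is to look the names up with `match()`. The `unique()` call is an addition here: transaction 2 in the sample data lists item 1 twice, and duplicated items can cause an error or a warning during coercion, depending on the arules version.

```r
library(arules)

# Sample data from the question, de-duplicated (assumption: repeats carry no meaning)
dataset <- unique(data.frame(
  products       = c(1, 1, 1, 3, 4, 5, 6, 4),
  transaction_id = c(2, 2, 3, 3, 3, 4, 4, 4)
))

# Lookup table from the question: item ID -> item name
cod <- data.frame(
  products = c(1, 3, 4, 5, 6),
  names    = c("nail", "Black Hammer 127", "White desk 12", "green desk", "pink car"),
  stringsAsFactors = FALSE
)

transaction <- as(split(dataset$products, dataset$transaction_id), "transactions")

# The current labels are the IDs stored as strings; find each one in the lookup table
idx <- match(itemLabels(transaction), as.character(cod$products))
stopifnot(!anyNA(idx))  # stop early if some ID has no name

# Replace the ID labels with the matching names, whatever order they are in
itemLabels(transaction) <- cod$names[idx]

rule <- apriori(transaction, parameter = list(supp = 0.001, conf = 0.8))
inspect(rule)
```

Because the relabeling happens after the transactions are built, the slow step of handling long item names in `split` is avoided.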

    Note that split is rather slow. The example in ?transactions says this about using split:

    ## Note: This is very slow for large datasets. It is much faster to 
    ## read transactions in this format from disk using read.transactions() 
    ## with format = "single".
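The note above can be sketched as follows. This is only an illustration, assuming the data is available as a two-column file with one transaction ID and item pair per line; here the sample data from the question is written to a temporary file first (the file path and column order are choices made for this example, and `unique()` is added because transaction 2 lists item 1 twice).

```r
library(arules)

# Sample data from the question in long format: one (transaction, item) pair per row
dataset <- unique(data.frame(
  transaction_id = c(2, 2, 3, 3, 3, 4, 4, 4),
  products       = c(1, 1, 1, 3, 4, 5, 6, 4)
))

# Write the pairs to a temporary CSV file without header or row names
path <- tempfile(fileext = ".csv")
write.table(dataset, path, sep = ",", row.names = FALSE, col.names = FALSE)

# format = "single": each line holds one item of one transaction;
# cols = c(1, 2): column 1 is the transaction ID, column 2 is the item
transaction <- read.transactions(path, format = "single", sep = ",", cols = c(1, 2))
unlink(path)

summary(transaction)
```

The resulting object has item IDs as labels, so the names can still be substituted afterwards with `itemLabels(transaction) <- ...`.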