I'm trying to get arules to work with my data, which has transaction_ID, Item_name and Item_ID columns. If I call the apriori function with item_name and transaction_ID it is too slow, but if I call it with item_id and transaction_ID it is really fast. Is there a way to create the rules with item_id and then substitute the IDs with their real names? Here is a code example to work with:
library(arules)
library(arulesViz)
products <- c(1,1,1,3,4,5,6,4)
transaction_id <- c(2,2,3,3,3,4,4,4)
dataset <- data.frame(products ,transaction_id)
dataset
transaction <- as(split(dataset[,"products"],dataset[,"transaction_id"]), "transactions")
rule <- apriori(transaction, parameter = list(supp = 0.001, conf = 0.8))
inspect(rule)
products_id <- c(1,3,4,5,6)
names <- c("nail","Black Hammer 127","White desk 12","green desk","pink car")
cod <- data.frame(products = products_id, names)
The best way is to replace the item labels in the transactions using the function itemLabels:
itemLabels(transaction)
[1] "1" "3" "4" "5" "6"
itemLabels(transaction) <- c("nail","Black Hammer 127","White desk 12","green desk","pink car")
rule <- apriori(transaction, parameter = list(supp = 0.001, conf = 0.8))
inspect(rule)
lhs rhs support confidence
[1] {Black Hammer 127} => {nail} 0.3333333 1
...
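If the names should come from a lookup table like the cod data frame in the question instead of a hand-typed vector, something along these lines may work. This is only a sketch: it assumes every item ID has a row in cod, and the unique() step and the names item_list, idx and new_labels are my additions for illustration, not part of the question's code.

```r
library(arules)

# Data from the question.
dataset <- data.frame(
  products       = c(1, 1, 1, 3, 4, 5, 6, 4),
  transaction_id = c(2, 2, 3, 3, 3, 4, 4, 4)
)
cod <- data.frame(
  products = c(1, 3, 4, 5, 6),
  names    = c("nail", "Black Hammer 127", "White desk 12", "green desk", "pink car"),
  stringsAsFactors = FALSE
)

# Assumption: drop items repeated within one transaction before coercing.
item_list   <- lapply(split(dataset$products, dataset$transaction_id), unique)
transaction <- as(item_list, "transactions")

# The current labels are the IDs as text; look each one up in the table.
idx <- match(itemLabels(transaction), as.character(cod$products))
stopifnot(!anyNA(idx))  # fail early if an ID has no name

new_labels <- cod$names[idx]
itemLabels(transaction) <- new_labels
```

Matching on the ID instead of relying on position means the rows of cod do not have to be in the same order as the item labels.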
split is rather slow. The example in ?transactions says this about using split:
## Note: This is very slow for large datasets. It is much faster to
## read transactions in this format from disk using read.transactions()
## with format = "single".
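A rough sketch of that route, using the data from the question: it writes the data frame to a temporary CSV file only to have something to read back, and it assumes an arules version whose read.transactions() accepts header = TRUE so that cols can be given as column names. The name tmp_file is mine, and rm.duplicates = TRUE is there because transaction 2 lists item 1 twice.

```r
library(arules)

# Data from the question, one (item, transaction) pair per row.
dataset <- data.frame(
  products       = c(1, 1, 1, 3, 4, 5, 6, 4),
  transaction_id = c(2, 2, 3, 3, 3, 4, 4, 4)
)

# Write the pairs to disk; in practice the file would already exist.
tmp_file <- tempfile(fileext = ".csv")
write.table(dataset, tmp_file, sep = ",", row.names = FALSE, quote = FALSE)

# format = "single": each line holds one transaction ID and one item.
# cols gives the transaction ID column first, then the item column.
transaction <- read.transactions(
  tmp_file,
  format = "single",
  header = TRUE,
  sep = ",",
  cols = c("transaction_id", "products"),
  rm.duplicates = TRUE
)

rule <- apriori(transaction, parameter = list(supp = 0.001, conf = 0.8))
```

The item labels here are still the IDs, so they can be replaced with itemLabels afterwards in the same way as for transactions built with split.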