Tags: r, associations, apriori, market-basket-analysis

Association Rules Using arules and arulesViz from a Data Frame


I have an R data frame with two columns: a customer ID and a product name. A customer can have multiple products, so the same customer ID appears on several rows, one per product.

I'm trying to do a basic apriori analysis and determine some association rules for products purchased together. I would like to use the arules and arulesViz packages in R to do this.

When I run this, I usually get either 0 rules or rules of the form lhs product => rhs customer_id. So I don't believe I'm loading the data in a way that groups multiple products under a single customer to derive the associations.

Any help would be appreciated!

Basic Data Frame Example

library(arules)

df <- data.frame(
  cust_id = as.factor(c('1aa2j', '1aa2j', '2b345', '2b345', 'g78a8', 'y67r3')),
  product = as.factor(c("Bat", "Sock", "Hat", "Shirt", "Ball", "Shorts"))
)

rules <- apriori(df)
inspect(rules)

  lhs                 rhs              support confidence lift
1 {product=Bat}    => {cust_id=1aa2j}  0.167   1          3
2 {product=Sock}   => {cust_id=1aa2j}  0.167   1          3
3 {product=Hat}    => {cust_id=2b345}  0.167   1          3
4 {product=Shirt}  => {cust_id=2b345}  0.167   1          3
5 {cust_id=g78a8}  => {product=Ball}   0.167   1          6
6 {product=Ball}   => {cust_id=g78a8}  0.167   1          6
7 {cust_id=y67r3}  => {product=Shorts} 0.167   1          6
8 {product=Shorts} => {cust_id=y67r3}  0.167   1          6

Solution

  • This is adapted from the examples in the arules documentation for transactions (slightly modified):

    library(arules)
    df <- data.frame(
      cust_id = as.factor(c('1aa2j', '1aa2j', '2b345', '2b345', 'g78a8', 'y67r3')),
      product = as.factor(c("Bat", "Sock", "Hat", "Shirt", "Ball", "Shorts"))
    )
    
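    # Group the products by customer, then coerce the resulting list to transactions
    trans <- as(split(df[,"product"], df[,"cust_id"]), "transactions")
    # Each customer is now a single transaction holding all of their products
    inspect(trans)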
    
        items       transactionID
    [1] {Bat,Sock}  1aa2j        
    [2] {Hat,Shirt} 2b345        
    [3] {Ball}      g78a8        
    [4] {Shorts}    y67r3 
    

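The grouping is done by split(), which returns a named list with one vector of products per customer; the coercion to "transactions" then treats each list element as one basket. A minimal base R sketch of that intermediate step, where the variable name baskets is only illustrative and the conversion to character is done purely for readability:

```r
df <- data.frame(
  cust_id = as.factor(c("1aa2j", "1aa2j", "2b345", "2b345", "g78a8", "y67r3")),
  product = as.factor(c("Bat", "Sock", "Hat", "Shirt", "Ball", "Shorts"))
)

# One list element per customer, holding that customer's products
baskets <- split(df[, "product"], df[, "cust_id"])

# Convert the factor vectors to character so the printout is easier to read
baskets <- lapply(baskets, as.character)
str(baskets)
```

By contrast, passing the data frame directly to apriori() appears to treat every row as its own transaction, which would explain rules that pair a product with a cust_id.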
    Now you can use apriori on trans.
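As a rough sketch of that last step, assuming the same toy data and the trans object built as above: the thresholds below (support = 0.25, confidence = 0.8, minlen = 2) are illustrative choices for a four-basket example, not recommendations. Setting minlen = 2 simply excludes rules with an empty left-hand side.

```r
library(arules)

df <- data.frame(
  cust_id = as.factor(c("1aa2j", "1aa2j", "2b345", "2b345", "g78a8", "y67r3")),
  product = as.factor(c("Bat", "Sock", "Hat", "Shirt", "Ball", "Shorts"))
)

# One transaction per customer, as in the answer above
trans <- as(split(df[, "product"], df[, "cust_id"]), "transactions")

# Mine rules between products; thresholds are assumptions chosen for this tiny data set
rules <- apriori(
  trans,
  parameter = list(support = 0.25, confidence = 0.8, minlen = 2)
)

# Rules now relate products to products rather than products to customer IDs
inspect(rules)
```

On real data the support and confidence thresholds usually need tuning, since a support of 0.25 would be far too high for most baskets. If arulesViz is installed, plot(rules, method = "graph") is one way to visualize the resulting rules.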