I'm trying to find association rules from a CSV that has the following columns: Desc, which is the description of what was bought, and UUID, which is the unique ID of each transaction from an individual. That means there can be several Desc values for one UUID.
The kind of association rule I'm trying to find is, for instance: if many different UUIDs contain the same two Desc values, call them meat and beer, a rule would come out as {Meat} => {Beer}, with its support, confidence and lift.
The csv can be found here: https://github.com/antonio1695/RStudio/blob/master/facturas_du.csv
What I'm trying to do is:
library(arules)
df <- read.csv("facturas_du.csv")
rules <- apriori(df, parameter = list(support = 0.01, confidence = 0.3))
However, this gives me association rules with very low support, of the form:
{A UUID} => {A Desc}
Which isn't what I'm looking for.
I would like my UUID to be my transaction ID and have something like:
UUID DESC
123 Meat,Beer
I hope someone can help me figure out what to do. Thanks!
UUID should not be an item. You should first convert your data to transactions manually, so you can see how your data is actually being interpreted. Here is what you currently do:
library(arules)
df <- read.csv("https://raw.githubusercontent.com/antonio1695/RStudio/master/facturas_du.csv")
head(df)
Desc UUID
1 CONSUMO 38BD37F1-06E9-476B-8779-E6E8139B2586
2 CONSUMO DE ALIMENTOS 2BE26034-ED04-407A-ACE7-51764EEBE8CF
3 CONSUMO DE ALIMENTOS 9b24977d-8e67-4b0f-a55f-c0e886561b9d
4 PAGO POR USO DE ESTACIONAMIENTO 6FAEBEF1-2CCB-4DAB-BD2F-E765EC093F56
5 COPIA CARTA B&N 1-99 HOJAS 4D3F3204-3699-42DE-A97B-8D0F990B54A5
6 IMPRESION CARTA B&N 1-99 HOJAS 4D3F3204-3699-42DE-A97B-8D0F990B54A5
trans <- as(df, "transactions")
inspect(head(trans))
items transactionID
1 {Desc=CONSUMO,
UUID=38BD37F1-06E9-476B-8779-E6E8139B2586} 1
2 {Desc=CONSUMO DE ALIMENTOS,
UUID=2BE26034-ED04-407A-ACE7-51764EEBE8CF} 2
3 {Desc=CONSUMO DE ALIMENTOS,
UUID=9b24977d-8e67-4b0f-a55f-c0e886561b9d} 3
4 {Desc=PAGO POR USO DE ESTACIONAMIENTO,
UUID=6FAEBEF1-2CCB-4DAB-BD2F-E765EC093F56} 4
5 {Desc=COPIA CARTA B&N 1-99 HOJAS,
UUID=4D3F3204-3699-42DE-A97B-8D0F990B54A5} 5
6 {Desc=IMPRESION CARTA B&N 1-99 HOJAS,
UUID=4D3F3204-3699-42DE-A97B-8D0F990B54A5} 6
I don't think this is what you want. Each transaction should be a set of items and not a combination of one product and one UUID. I highly recommend that you read the arules package vignette.
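One way to get there, as a minimal sketch: group the Desc values by UUID with split() and coerce the resulting list to transactions, so that each UUID becomes one transaction and only the descriptions are items. The toy data frame below is made up for illustration (it is not your CSV), and minlen = 2 is my own addition to suppress rules with an empty left-hand side.

```r
library(arules)

# Hypothetical toy data in the same long format as the CSV:
# one row per (UUID, Desc) pair. Replace with your own read.csv() result.
df <- data.frame(
  UUID = c("a", "a", "b", "b", "c"),
  Desc = c("Meat", "Beer", "Meat", "Beer", "Meat"),
  stringsAsFactors = FALSE
)

# Collect the descriptions belonging to each UUID into one basket.
# unique() drops descriptions repeated within the same UUID.
baskets <- lapply(split(as.character(df$Desc), df$UUID), unique)

# Coerce the list of baskets to transactions: one transaction per UUID.
trans <- as(baskets, "transactions")
inspect(trans)

# Mine rules on the transactions; minlen = 2 (an assumption, not part of
# the original call) skips rules of the form {} => {item}.
rules <- apriori(trans,
                 parameter = list(support = 0.01, confidence = 0.3, minlen = 2))
inspect(rules)
```

With your data you would keep the same two conversion lines and only swap in the data frame read from the CSV. The support and confidence thresholds will probably need tuning once UUID is no longer treated as an item.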