Search code examples
rcsvdataframeaprioriarules

R - Apriori function error


I'm currently having problems with the apriori function. The thing is I have a csv with data like the following:

Desc,Cantidad,Valor,Fecha,Lugar,UUID
DESCUENTO,1,-3405,2014-10-04T14:02:57,53100,7F74AFC0-FC28-4105-89A5-CD99416B50C7
DESCUENTO,1,-3405,2014-10-04T14:02:57,53100,7F74AFC0-FC28-4105-89A5-CD99416B50C7
DESCUENTO,1,-170,2014-09-05T15:10:24,83000,7F0C7F0B-BCFC-4FCA-8740-B36AE9932869
Descuento de TYK Dia,1,-156,2014-06-19T16:52:27,86280,1E08E51E-213A-4EE0-8FE9-492E677FF0C9
Descuento de TYK Dia,1,-139,2014-04-25T10:52:44,86280,AB802E63-2D0D-4B47-AB70-DDE007929F9F
DESCUENTO,1,-63,2014-07-04T13:53:10,83000,5B1F12BB-71DE-4734-A774-8D377757A880
REDONDEO,1,-1,2014-03-29T10:50:59,0,5B241EFA-6654-46EA-B47A-3CB76C5EA923
DESCUENTO,1,-1,2014-10-04T14:02:57,53100,7F74AFC0-FC28-4105-89A5-CD99416B50C7
DESCUENTO,1,-1,2014-10-04T14:02:57,53100,7F74AFC0-FC28-4105-89A5-CD99416B50C7
LAVADO,1,0,2014-05-27T18:18:11,44500,e5d540d6-0f98-4993-ec09-56887cd4a27d
TUA,1,0,2014-09-29T10:20:31,6500,1d8ada06-a8a1-4bd8-9356-851b5da28108
Transportación Aerea,1,0,2014-10-03T10:41:09,6500,5fc3925a-d08a-4cdc-be7e-ca02bd488d5b
OBSEQUIO LAVADO DE CARROCERIA,1,0,2014-04-07T13:45:55,91800,8148ab07-5804-4b2b-b37c-5323b394907a
Arroz Al Azafran Combos A,1,0,2014-08-19T11:50:34,11520,f09c23e6-dc60-4aaf-a1b8-1506d38f3585
Frijoles Charros A,1,0,2014-08-19T11:50:34,11520,f09c23e6-dc60-4aaf-a1b8-1506d38f3585
Pepsi Ch A,1,0,2014-08-19T11:50:34,11520,f09c23e6-dc60-4aaf-a1b8-1506d38f3585
FECHA DE CONSUMO 18/07/2014,1,0,2014-07-19T18:01:45,6060,0f0465aa-a75b-4f95-8e3b-43c13452cafb
CAMBIO DE ACEITE DE MOTOR,1,0,2014-02-01T11:18:53,39890,5BDF0742-CDF5-4F6B-9937-DF1CB00274ED
CAMBIO DE FILTRO DE ACEITE,1,0,2014-02-01T11:18:53,39890,5BDF0742-CDF5-4F6B-9937-DF1CB00274ED

Whole CSV (https://github.com/antonio1695/BaseX/blob/master/facturas1.csv) To download the file just click on find file and then you will see the file. So what I did was:

> df1 <- read.csv("facturas1.csv")
> rules <- apriori(df1,parameter=list(support=0.01,confidence=0.5))
Error in asMethod(object) : 
column(s) 3 not logical or a factor. Discretize the columns first.

Nevertheless, the problem is that the columns are discrete already and if I change the data in order for it to have column 3 in the place of column 2 and viceversa. It still says that that column 3 is not logical or a factor when it should say it about column 2 instead. Thanks!


Solution

  • After some research I found that the apriori function must take intervals in order for it to work properly, so when you use discretize you must add the parameter "categories" to select how many intervals you want. It isn't possible for it not to take intervals. I'll post the code here:

    I decided to take 20 intervals which are all depending on how often the value in the interval is repeated.

    df$Valor <- discretize(df$Valor, method="frequency",categories = 20)
    

    Hope it helps somebody.