r algorithm machine-learning associations arules

Association rules between many continuous variables

I have a large dataset and I'm trying to mine association rules between the variables.

My problem is that I have 160 variables among which I have to look for association rules, and I also have more than 1800 item-sets.

Furthermore, my variables are continuous. To mine association rules I usually use the apriori algorithm but, as is well known, this algorithm requires categorical variables.

Does anyone have any suggestions on what kind of algorithm I can use in this case?

A restricted example of my dataset is the following:

ID_Order   Model     ordered quantity
A.1        typeX     20
A.1        typeZ     10
A.1        typeY     5
B.2        typeX     16
B.2        typeW     12
C.3        typeZ     1
D.4        typeX     8
D.4        typeG     4
...

My goal is to mine association rules and correlations between different products, perhaps with a neural network algorithm in R. Does anyone have any suggestions on how to solve this problem?

Thanks in advance

Solution

You can create transactions from your dataset like this:

library(dplyr)
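The steps below assume the sample rows are stored in a data frame named df. A minimal way to reconstruct it from the rows shown in the question is sketched here. The column name Quantity is my own placeholder for "ordered quantity", and the rows elided by "..." are left out:

```r
# Sample rows from the question; Quantity is a placeholder column name.
df <- data.frame(
  ID_Order = c("A.1", "A.1", "A.1", "B.2", "B.2", "C.3", "D.4", "D.4"),
  Model    = c("typeX", "typeZ", "typeY", "typeX", "typeW", "typeZ",
               "typeX", "typeG"),
  Quantity = c(20, 10, 5, 16, 12, 1, 8, 4),
  stringsAsFactors = FALSE
)
```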

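This helper wraps the Models of one order in a list, so that each ID_Order ends up with a single character vector. unique() drops repeated models, because arules does not accept duplicated items within one transaction:

concat <- function(x) {
  list(unique(as.character(x)))
}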

Group df by ID_Order and concatenate. pull() returns the concatenated Models as a list, and its name argument labels each element with the matching ID_Order, so the names stay aligned with the groups even if df is not sorted by ID_Order:

a_list <- df %>% 
  group_by(ID_Order) %>% 
  summarise(concat = concat(Model)) %>%
  pull(concat, name = ID_Order)

Then you can use the package arules:

library(arules)

Get an object of class transactions:

transactions <- as(a_list, "transactions")
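If it helps to see what the transactions object encodes, the same information can be laid out as an order-by-model incidence matrix in base R. This is only an illustrative sketch on the sample rows from the question, not a step of the arules workflow:

```r
# Sample rows from the question (quantities are not needed here).
df <- data.frame(
  ID_Order = c("A.1", "A.1", "A.1", "B.2", "B.2", "C.3", "D.4", "D.4"),
  Model    = c("typeX", "typeZ", "typeY", "typeX", "typeW", "typeZ",
               "typeX", "typeG"),
  stringsAsFactors = FALSE
)

# TRUE where a model appears in an order, FALSE otherwise.
incidence <- unclass(table(df$ID_Order, df$Model)) > 0
print(incidence)

# Share of orders containing each model (its support as a single item).
item_support <- colMeans(incidence)
print(item_support)
```

Each row is one transaction and each column one item. As far as I know, arules can also coerce a logical matrix like this one to transactions, so this layout is an alternative starting point if your data is already in wide form.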

Extract the rules. You can set the minimum support and minimum confidence with supp and conf, respectively.

rules <- apriori(transactions, 
                 parameter = list(supp = 0.1, conf = 0.5, target = "rules"))
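To make the two thresholds concrete, here is a small base R sketch that computes support, confidence and lift by hand for one candidate rule, {typeX,typeZ} => {typeY}, on the sample orders from the question. The helper name support is my own and is not part of arules:

```r
# One set of models per order, taken from the question's sample rows.
orders <- list(
  A.1 = c("typeX", "typeZ", "typeY"),
  B.2 = c("typeX", "typeW"),
  C.3 = c("typeZ"),
  D.4 = c("typeX", "typeG")
)

# Fraction of orders that contain every item in `items`.
support <- function(items) {
  mean(vapply(orders, function(o) all(items %in% o), logical(1)))
}

lhs <- c("typeX", "typeZ")
rhs <- "typeY"

rule_support    <- support(c(lhs, rhs))            # lhs and rhs together
rule_confidence <- rule_support / support(lhs)     # P(rhs | lhs)
rule_lift       <- rule_confidence / support(rhs)  # confidence vs. baseline

print(c(support = rule_support,
        confidence = rule_confidence,
        lift = rule_lift))
```

This rule has support 0.25 and confidence 1, so it passes supp = 0.1 and conf = 0.5. By contrast, {typeX} => {typeZ} has confidence 0.25 / 0.75, about 0.33, and is filtered out.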

To inspect the rules use:

inspect(rules)

And this is what you get:

     lhs              rhs     support confidence lift      count
[1]  {}            => {typeZ} 0.50    0.50       1.0000000 2    
[2]  {}            => {typeX} 0.75    0.75       1.0000000 3    
[3]  {typeW}       => {typeX} 0.25    1.00       1.3333333 1    
[4]  {typeG}       => {typeX} 0.25    1.00       1.3333333 1    
[5]  {typeY}       => {typeZ} 0.25    1.00       2.0000000 1    
[6]  {typeZ}       => {typeY} 0.25    0.50       2.0000000 1    
[7]  {typeY}       => {typeX} 0.25    1.00       1.3333333 1    
[8]  {typeZ}       => {typeX} 0.25    0.50       0.6666667 1    
[9]  {typeY,typeZ} => {typeX} 0.25    1.00       1.3333333 1    
[10] {typeX,typeY} => {typeZ} 0.25    1.00       2.0000000 1    
[11] {typeX,typeZ} => {typeY} 0.25    1.00       4.0000000 1
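
The rules above use only which models appear in an order and ignore the ordered quantity. One common workaround for continuous values is to bin them into categories and fold the bin into the item label before building the transactions. The following is a sketch under my own assumptions: the column name Quantity, the cut points 5 and 15, and the labels low/mid/high are all placeholders that you would choose from your own data:

```r
# Sample rows from the question; Quantity is a placeholder column name.
df <- data.frame(
  ID_Order = c("A.1", "A.1", "A.1", "B.2", "B.2", "C.3", "D.4", "D.4"),
  Model    = c("typeX", "typeZ", "typeY", "typeX", "typeW", "typeZ",
               "typeX", "typeG"),
  Quantity = c(20, 10, 5, 16, 12, 1, 8, 4),
  stringsAsFactors = FALSE
)

# Bin the quantity: (0,5] -> low, (5,15] -> mid, (15,Inf] -> high.
df$QtyBin <- cut(df$Quantity, breaks = c(0, 5, 15, Inf),
                 labels = c("low", "mid", "high"))

# Each item is now "model=bin", for example "typeX=high".
df$Item <- paste(df$Model, df$QtyBin, sep = "=")

# One character vector of items per order, named by ID_Order.
item_list <- split(df$Item, df$ID_Order)
print(item_list)
```

item_list can then be passed to as(item_list, "transactions") and apriori() in the same way as a_list. arules also has a discretize() function that may help with choosing the bins (for instance by equal frequency) instead of fixing the cut points by hand.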