Remove column labels from a transaction object

I have a data frame df like below:

df <- data.frame(V1 = c("Prod1", "Prod2", "Prod3"),
                 V2 = c("Prod3", "Prod1", "Prod2"), 
                 V3 = c("Prod2", "Prod1", "Prod3"), 
                 City = c("City1", "City2", "City3"))

When I convert this to the transactions class (from the arules package) using:

tData <- as(df, "transactions")

I get a result like below:

    items                                   transactionID
[1] {V1=Prod1,V2=Prod3,V3=Prod2,City=City1} 1            
[2] {V1=Prod2,V2=Prod1,V3=Prod1,City=City2} 2            
[3] {V1=Prod3,V2=Prod2,V3=Prod3,City=City3} 3   

This means that V1=Prod1 and V2=Prod1 are treated as separate items when they are actually the same product. This causes problems when I run the apriori algorithm on the data.

How can I remove the column labels so that I get the transaction object as:

    items                     transactionID
[1] {Prod1,Prod3,Prod2,City1} 1
[2] {Prod2,Prod1,Prod1,City2} 2
[3] {Prod3,Prod2,Prod3,City3} 3

Please help.


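  • Your data format is somewhat unusual: every transaction has exactly the same number of items. Coercing a data.frame treats each column as a separate variable, which is why the items are labeled column=value. To convert this correctly, you need a list of transactions rather than a data.frame.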

    df <- data.frame(
      V1 = c("Prod1", "Prod2", "Prod3"),
      V2 = c("Prod3", "Prod1", "Prod2"), 
      V3 = c("Prod2", "Prod1", "Prod3"), 
      City = c("City1", "City2", "City3"))
    m <- as.matrix(df)
    l <- lapply(seq_len(nrow(m)), function(i) m[i, ])

    This is the list format with each transaction as a list element.

         V1      V2      V3    City 
    "Prod1" "Prod3" "Prod2" "City1" 
         V1      V2      V3    City 
    "Prod2" "Prod1" "Prod1" "City2" 
         V1      V2      V3    City 
    "Prod3" "Prod2" "Prod3" "City3" 

    Now it can be coerced into transactions:

    library(arules)
    trans <- as(l, "transactions")
    inspect(trans)
    [1] {City1,Prod1,Prod2,Prod3}
    [2] {City2,Prod1,Prod2}
    [3] {City3,Prod2,Prod3}

    Transactions 2 and 3 contain duplicate items (Prod1 and Prod3, respectively). These are removed during coercion, because a transaction is a set of items.
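
The row-to-list step can also be sketched with the column names and duplicates dropped explicitly, so the list already looks like what ends up in the transactions. This is a minimal sketch, not the only way to do it: the name `items` is introduced here for illustration, and the final coercion is guarded so that it only runs if the arules package is installed.

```r
# Same data as in the question; stringsAsFactors = FALSE keeps plain strings.
df <- data.frame(
  V1 = c("Prod1", "Prod2", "Prod3"),
  V2 = c("Prod3", "Prod1", "Prod2"),
  V3 = c("Prod2", "Prod1", "Prod3"),
  City = c("City1", "City2", "City3"),
  stringsAsFactors = FALSE)

m <- as.matrix(df)

# One list element per row. unname() drops the V1/V2/V3/City labels and
# unique() removes repeated items within a row (e.g. Prod1 twice in row 2).
items <- lapply(seq_len(nrow(m)), function(i) unique(unname(m[i, ])))

# Coerce to transactions only when arules is available (assumption: the
# package is installed; otherwise this block is skipped).
if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  trans <- as(items, "transactions")
  inspect(trans)
}
```

Deduplicating up front makes it visible in the list itself that row 2 contributes only Prod2, Prod1 and City2, instead of leaving the removal to the coercion.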