
Order a list of words in one column of an R data frame


I have a data frame of rules produced by apriori, as shown below:

rules
{A,B} => {C}
{C,A} => {B}
{A,B} => {D}
{A,D} => {B}
{A,B} => {E}
{E,A} => {B}

I have got as far as grouping the items of each rule into a single column (the data frame is df_basket):

rules           basket
{A,B} => {C}    A,B,C
{C,A} => {B}    C,A,B
{A,B} => {D}    A,B,D
{A,D} => {B}    A,D,B
{A,B} => {E}    A,B,E
{E,A} => {B}    E,A,B

I want to sort the items of each basket alphabetically, as shown in the Group column below:

rules           basket  Group
{A,B} => {C}    A,B,C   A,B,C
{C,A} => {B}    C,A,B   A,B,C
{A,B} => {D}    A,B,D   A,B,D
{A,D} => {B}    A,D,B   A,B,D
{A,B} => {E}    A,B,E   A,B,E
{E,A} => {B}    E,A,B   A,B,E

The code below gets the job done for small data frames, but the for loop is inefficient for large ones. Please help me optimize this operation in R:

for (i in seq_len(nrow(df_basket))) {
  # Split the basket into its items, sort them, and paste them back together
  items <- unlist(strsplit(df_basket$basket[i], ","))
  df_basket$Group[i] <- paste(items[order(items)], collapse = ",")
}

Please let me know if there is an easier or more direct way to get the "Group" field of my data frame.


Solution

  • Try adapting this solution:

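    # Split a comma-separated string, sort its items, and rejoin them
    f <- function(x) {
      sorted <- sort(unlist(strsplit(x, ",")))
      paste0(sorted, collapse = ",")
    }
    # Apply f to each basket string and show the result next to the original
    cbind(basket, unlist(lapply(basket, f)))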
    

    Input data (define it before calling cbind):

    basket<-c("A,B,C","C,A,B","A,B,D","A,D,B","A,B,E","E,A,B")
    

    Output:

         basket         
    [1,] "A,B,C" "A,B,C"
    [2,] "C,A,B" "A,B,C"
    [3,] "A,B,D" "A,B,D"
    [4,] "A,D,B" "A,B,D"
    [5,] "A,B,E" "A,B,E"
    [6,] "E,A,B" "A,B,E"
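
The same idea can be written straight into the data frame from the question, without an explicit loop. The following is a minimal sketch, not a definitive implementation: the df_basket built here is a hypothetical reconstruction from the tables in the question, and it assumes the basket column holds plain comma-separated strings with no extra spaces. It splits all baskets in one vectorised strsplit call and uses vapply to sort and rejoin each one.

```r
# Hypothetical reconstruction of the asker's data frame (assumed from the question's tables)
df_basket <- data.frame(
  rules  = c("{A,B} => {C}", "{C,A} => {B}", "{A,B} => {D}",
             "{A,D} => {B}", "{A,B} => {E}", "{E,A} => {B}"),
  basket = c("A,B,C", "C,A,B", "A,B,D", "A,D,B", "A,B,E", "E,A,B"),
  stringsAsFactors = FALSE
)

# Split every basket in one call; as.character guards against a factor column
items <- strsplit(as.character(df_basket$basket), ",", fixed = TRUE)

# Sort the items of each basket and rejoin them; vapply guarantees a character vector
df_basket$Group <- vapply(
  items,
  function(x) paste(sort(x), collapse = ","),
  character(1)
)

print(df_basket)
```

Using vapply rather than sapply makes the result type explicit, so a malformed row raises an error instead of silently changing the shape of the new column.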