How to name dataframes by permutation order of columns using first letter


From a single data.frame, I generated one data.frame per permutation of its columns. From here, I would like to 1) name each permuted data.frame after its permutation order, using the first letter of each column name, and 2) cbind each of these data.frames with another one:

data1 <- data.frame("Alpha"=c(1,2), "Beta"=c(2,2), "Gamma"=c(4,8))
data2 <- data.frame("Delta"=c(22,3))

library(combinat)
idx <- permn(ncol(data1))
res <- lapply(idx, function(x) data1[x])
res
[[1]]
  Alpha Beta Gamma
1     1    2     4
2     2    2     8

[[2]]
  Alpha Gamma Beta
1     1     4    2
2     2     8    2

[[3]]
  Gamma Alpha Beta
1     4     1    2
2     8     2    2

...

[[6]]
  Beta Alpha Gamma
1    2     1     4
2    2     2     8

First, I would like each of the data.frames above to be named after its permutation order, using the first letter of each column name, so that I get the following data.frames:

dataABG
  Alpha Beta Gamma
1     1    2     4
2     2    2     8

dataAGB
  Alpha Gamma Beta
1     1     4    2
2     2     8    2

dataGAB
  Gamma Alpha Beta
1     4     1    2
2     8     2    2

...

Then, I want to cbind each of the previous data frames with data2, keeping the previous dataframe names.


Solution

  • You can create the names using lapply together with substr on each data frame's column names. This assumes that you want the first letter of every column to appear in the name:

    names(res) <- unlist(lapply(res,function(x) sprintf('data%s',paste0(substr(colnames(x),1,1),collapse = ''))))
    
    res
    
    # $dataABG
    # Alpha Beta Gamma
    # 1     1    2     4
    # 2     2    2     8
    # 
    # $dataAGB
    # Alpha Gamma Beta
    # 1     1     4    2
    # 2     2     8    2
    # 
    # $dataGAB
    # Gamma Alpha Beta
    # 1     4     1    2
    # 2     8     2    2
    

    Now to append the column from data2, you can again use lapply:

    lapply(res,function(x) cbind(x,data2))
    
    # $dataABG
    # Alpha Beta Gamma Delta
    # 1     1    2     4    22
    # 2     2    2     8     3
    # 
    # $dataAGB
    # Alpha Gamma Beta Delta
    # 1     1     4    2    22
    # 2     2     8    2     3
    # 
    # $dataGAB
    # Gamma Alpha Beta Delta
    # 1     4     1    2    22
    # 2     8     2    2     3
    

    EDIT:

    To minimize the use of lapply, you can cbind data2 at the point where you select each permutation, and then exclude its column when building the names:

    library(combinat)
    idx <- permn(ncol(data1))
    res <- lapply(idx, function(x) cbind(data1[x],data2))
    
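    # Drop the last column (Delta, from data2) before taking first letters.
    # substr() is base R; str_sub() would require library(stringr).
    names(res) <- unlist(lapply(res,function(x) sprintf('data%s',paste0(substr(colnames(x)[-length(colnames(x))],1,1),collapse = ''))))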
    

    This will save you a whole lapply call.
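
For reference, the approach from the EDIT can be sketched as one self-contained script. This is only a sketch under a few assumptions: `perms` is a hypothetical base-R helper standing in for `combinat::permn` (so the permutations may come out in a different order than in the output shown above), and `vapply` plus `setNames` are swapped in for `unlist(lapply(...))` and `names<-`. The names are built from the column names of `data1` directly, so the `Delta` column never has to be excluded.

```r
# Hypothetical helper: all permutations of a vector, in base R.
# (The original answer uses combinat::permn for this step.)
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

data1 <- data.frame(Alpha = c(1, 2), Beta = c(2, 2), Gamma = c(4, 8))
data2 <- data.frame(Delta = c(22, 3))

# One index vector per permutation of the columns of data1.
idx <- perms(seq_len(ncol(data1)))

# Build each name from the first letters of the permuted data1 columns.
nms <- vapply(
  idx,
  function(x) paste0("data", paste0(substr(names(data1)[x], 1, 1), collapse = "")),
  character(1)
)

# Permute the columns, append data2, and attach the names in one step.
res <- setNames(lapply(idx, function(x) cbind(data1[x], data2)), nms)

names(res)
res$dataGAB
```

Each data frame can then be accessed by name, for example `res$dataGAB`. If separate objects are really needed rather than a named list, `list2env(res, envir = globalenv())` would create them, although keeping everything in one list is usually easier to work with.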