
Parallel cbind of data frames when the number of rows in the data frames could differ


Consider the following example:

a1<-data.frame(a=c(1,2,3),b=c(4,5,6))
a2<-data.frame(a=c(5,6),b=c(7,8))
a3<-data.frame(e=c(34,26),f=c(41,65))
a4<-data.frame(e=c(13,25,567),f=c(14,57,56))

I want to cbind a1 to a3 after dropping the last row of a1, and a2 to a4 after dropping the last row of a4 to produce

  a b  e  f
1 1 4 34 41
2 2 5 26 65

and

  a b  e  f
1 5 7 13 14
2 6 8 25 57

Map(cbind, list(a1,a2),list(a3,a4)), as has been suggested elsewhere, will work only if all data frames have the same number of rows. How do I cbind after dropping the extra rows from whichever data frame in each pair is longer?


Solution

  • We can collect all the data frames into a list with mget and split it into two halves. Pass the halves to Map; for each pair, take the smaller of the two row counts, subset both data frames to those rows, and cbind them.

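    # Collect a1..a4 into a named list
    list_dfs <- mget(paste0('a', 1:4))
    half <- length(list_dfs) / 2
    
    # Pair the first half with the second half (a1 with a3, a2 with a4)
    Map(function(x, y) {
        # keep only as many rows as the shorter data frame has
        rows <- seq_len(min(nrow(x), nrow(y)))
        # drop = FALSE keeps single-column data frames as data frames
        cbind(x[rows, , drop = FALSE], y[rows, , drop = FALSE])
        }, list_dfs[seq_len(half)], 
           list_dfs[half + seq_len(half)])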
    
    #$a1
    #  a b  e  f
    #1 1 4 34 41
    #2 2 5 26 65
    
    #$a2
    #  a b  e  f
    #1 5 7 13 14
    #2 6 8 25 57
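
If the data frames are not conveniently named a1..a4, the same idea can be sketched without mget. The helper name cbind_trim below is hypothetical (it is not part of the answer above); it uses head() to truncate both inputs to the shorter row count, and the two lists are paired explicitly as in the question.

```r
# Hypothetical helper: cbind two data frames after truncating
# both to the row count of the shorter one.
cbind_trim <- function(x, y) {
  n <- min(nrow(x), nrow(y))
  # head() on a data frame always returns a data frame,
  # so single-column inputs are not collapsed to vectors
  cbind(head(x, n), head(y, n))
}

# Data from the question
a1 <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
a2 <- data.frame(a = c(5, 6), b = c(7, 8))
a3 <- data.frame(e = c(34, 26), f = c(41, 65))
a4 <- data.frame(e = c(13, 25, 567), f = c(14, 57, 56))

# Pair a1 with a3 and a2 with a4, as in the question
res <- Map(cbind_trim, list(a1, a2), list(a3, a4))
```

Because the input lists here are unnamed, res is an unnamed list; wrap them as list(a1 = a1, a2 = a2) if the $a1 and $a2 labels are wanted.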