r, list, nested-loops, cbind

R: How can I cbind specific columns of all data frames of a nested loop within the loop?


I am trying to combine the third column of several data frames, which are fetched by name and renamed inside a nested for loop, all within that same loop.

# Sample Data
ecvec_msa6_1998=matrix( round(rnorm(200, 5,15)), ncol=4)
ecvec_msa6_1999=matrix( round(rnorm(200, 4,16)), ncol=4)
ecvec_msa6_2000=matrix( round(rnorm(200, 3,17)), ncol=4)

datasets=c("msa")
num_industrys=c(6)
years=c(1998, 1999, 2000)

alist=list() 

for (d in 1:length(datasets)) {
  dataset=datasets[d]
  for (n in 1:length(num_industrys)){
    num_industry=num_industrys[n]
    for (y in 1:length(years)) {
      year=years[y]

     eval(parse(text=paste0("newly_added = ecvec_", dataset, num_industry, "_",  year))) 
     # renaming the old data frames

     alist = list(alist, newly_added) # combining them in a list

     extracted_cols <- lapply(alist, function(x) x[3]) # selecting the third column

     result <- do.call("cbind", extracted_cols) # trying to cbind the third column

    }
  }
}

Can somebody show me the right way to do this?


Solution

  • Your code almost works; here are a few changes.

    alist=list() 
    
    for (d in 1:length(datasets)) {
      dataset=datasets[d]
      for (n in 1:length(num_industrys)){
        num_industry=num_industrys[n]
        for (y in 1:length(years)) {
          year=years[y]
          eval(parse(text=paste0("newly_added = ecvec_", dataset, num_industry, "_",  year)))                                   
          # c() appends one element and keeps the list flat;
          # list(alist, newly_added) nested the old list one level deeper each time
          alist = c(alist, list(newly_added))
        }
      }
    }
    
    #once you have your list, these commands should be outside the loop          
    extracted_cols <- lapply(alist, function(x) x[,3]) #note the added comma: x[,3] is the third column, x[3] on a matrix is only the third element
    result <- do.call(cbind, extracted_cols) #no quotes needed around cbind
    
    head(result)
         [,1] [,2] [,3]
    [1,]   11   13   24
    [2,]  -26   -3    7
    [3,]   -1  -26  -14
    [4,]    5   14  -15
    [5,]   28    3    8
    [6,]    9   -9   19
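
    The two changes above can be checked in isolation. The following is a minimal sketch, assuming two small toy matrices (m1 and m2 are hypothetical stand-ins, not the ecvec_* objects from the question). It shows how list(alist, x) nests while c(alist, list(x)) stays flat, and how x[3] differs from x[, 3] on a matrix.

    ```r
    # Hypothetical toy matrices standing in for the ecvec_* objects
    m1 <- matrix(1:8, ncol = 4)
    m2 <- matrix(9:16, ncol = 4)

    # list(alist, x) wraps the previous list as a single element,
    # so every iteration adds one more level of nesting
    nested <- list()
    nested <- list(nested, m1)
    nested <- list(nested, m2)
    length(nested)        # 2: the old list plus the newest matrix
    is.list(nested[[1]])  # TRUE: the first element is a list, not a matrix

    # c(alist, list(x)) appends one element and keeps the list flat
    flat <- list()
    flat <- c(flat, list(m1))
    flat <- c(flat, list(m2))
    length(flat)                  # 2
    all(sapply(flat, is.matrix))  # TRUE: every element is a matrix

    # On a matrix, x[3] is the third element; x[, 3] is the third column
    m1[3]    # 3
    m1[, 3]  # 5 6
    ```

    With a flat list, lapply sees one matrix per element, which is what the column extraction needs.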
    

    However, a more R-like (and typically faster) way of doing this, which avoids growing a list inside nested loops, is to replace all of the above with:

    df <- expand.grid(datasets,num_industrys,years) #generate all combinations
    datanames <- paste0("ecvec_",df$Var1,df$Var2,"_",df$Var3) #paste them into a vector of names
    result <- sapply(datanames,function(x) get(x)[,3])
    

    sapply automatically simplifies its result to a vector or matrix if it can (here a matrix with one column per entry of datanames), whereas lapply always produces a list.
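
    To see what sapply returns compared with lapply, here is a minimal sketch, assuming two small toy matrices (toy_a, toy_b and toy_names are hypothetical names, not from the question). Note that the simplified result is a matrix rather than a data frame, and that it holds the same values as the do.call(cbind, ...) version.

    ```r
    # Hypothetical objects standing in for the ecvec_* matrices
    toy_a <- matrix(1:8, ncol = 4)
    toy_b <- matrix(11:18, ncol = 4)
    toy_names <- c("toy_a", "toy_b")

    # lapply always returns a list, one element per name
    as_list <- lapply(toy_names, function(x) get(x)[, 3])
    class(as_list)  # "list"

    # sapply simplifies equal-length results into a matrix,
    # with one column per name
    as_matrix <- sapply(toy_names, function(x) get(x)[, 3])
    is.matrix(as_matrix)  # TRUE
    colnames(as_matrix)   # "toy_a" "toy_b"

    # Binding the list by hand gives the same values, just without column names
    cbind_result <- do.call(cbind, as_list)
    all(as_matrix == cbind_result)  # TRUE
    ```

    If a data frame is needed afterwards, as.data.frame(as_matrix) converts the matrix and keeps the column names.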