Search code examples
rexceldataframeperformancenested-for-loop

Speed up my nested for loop in R and storing a very large data frame in excel beyond its limits


I'm struggling to create a dataframe using the following codelines in R for complete 36 hours and more. Any suggestions to make this code faster would be admired. However, by the end, I need to store this data frame in an excel file, but I noticed that this particular code required over 300 million rows, which exceeds the typical excel sheet length. I look forward to get help in storing this in an excel (or notepad) file for future use as well.

library(writexl)
team_b <- 10:120
individual_b <- 1:84
team_s <- 1:250
individual_s <- 1:150

d <- data.frame()
for (i in team_b) {
  for (j in individual_b) {
    for (k in team_s) {
      for (l in individual_s) {
        sc <- l/k 
        bu <- j/i 
        sr <- l/j 
        pi <- sc/bu
        if(bu>0.7||sc>0.7||sr>6){
          c = "unrealistic"
        }
        else{
          c = "realistic"
        }
        d <- rbind(d, data.frame(i,j,k,l,sc,bu,sr,pi,c)) 
      }
    }
  }
}
colnames(d) <- c("T_b", "I_b", "T_s", "I_s", "BU", "SC", "SR", "PI", "Comment")
#View(d)

write_xlsx(d, "d.xlsx")

Solution

  • library(data.table)
    d <- CJ(team_b, individual_b, team_s, individual_s) # generate all combinations
    setnames(d, c('i', 'j', 'k', 'l'))
    d[, sc := l/k] 
    d[, bu := j/i] 
    d[, sr := l/j] 
    d[, pi := sc/bu]
    d[, c := ifelse(bu > 0.7 | sc > 0.7 | sr > 6, "unrealistic", "realistic")]