Search code examples
rloopslinear-programmingsolvernormal-distribution

Loop through a set of items in R


I'm trying to do a simulation of a linear programming problem. The model has an obj function, and some constraints. In this case, I am introducing 2 values that are randomly drawn from the normal distribution.

I am then simulating the optimization model 10.000 times using a for-loop. I know it is bad R practise to use for-loops, but speed is not my concern in this case.

#List of 10000 random, normally distributed observations:
Demand = rnorm(10000, mean=5, sd=2.5)
Demand

Performance = rnorm(10000, mean=100, sd=6)
Performance

Optimas = NULL

#combined_list = c(Demand, Performance)

for (i in Performance){
    op <- OP(objective = c(2.5, 4, i),               #Performance value[i]: works fine 
             L_constraint(L = matrix(c(1, 0, 0,      #LHS
                                       0, 1, 0,       
                                       0, 0, 1),      
                                      ncol=3, nrow = 3, 
                                      byrow = TRUE), 
                          dir = c("<=", "<=", "<="), 
                          rhs = c(50, 70, Demand)),  #Demand value[i]: should go here
              maximum = TRUE,
              types = c("B", "I", "I"))  

    Optima <- ROI_solve(op, solver = "glpk") #solve
    print(Optima)
    print(i)
    Optimas = rbind(Optimas, Optima)
}


Optimas <- as.data.frame(Optimas)
Optimas$objval <- as.numeric(Optimas$objval)
hist(Optimas$objval)

As seen above, my loop does only go trough one variable (Performance), where i want the Demand value for row(i) in the Demand vector to be put into the model, at the same time as performance value for row(i) in the performance vector.

The overall objective is to have 10.000 simulations of the LP model, where both the performance values, as well as the demand values only occur once(as another loop outside the one I already have, will produce the cartesian product of the two lists).


Solution

  • Note that your Performance and Demand vectors contain exactly the same number of elements. Consequently, you can just simply loop over the vector indices and use the respective index values to extract the relevant elements.

    I can not run your example code because I'm not sure which optimization package you are using for your OP function. As an example, I will define a simple dummyFunction that takes a performance and demand value as input, as follows:

    dummyFunction <- function(perf, dem){ return (perf+dem)}
    

    In your specific use case, the dummyFunction would contain the optimization logic. Next, you can obtain the required solution as follows, by iterating over the vector indices:

    Optimas = vector(mode="numeric", length=length(Performance))
    for(idx in 1:length(Performance)){
      Optimas[idx] <- dummyFunction(Performance[idx], Demand[idx])
    }
    

    Alternatively, you can avoid the function definition by just placing your optimization logic inside the for loop. For a more "R like solution" consider the usage of sapply / lapply type functions:

    Optimas <- sapply(1:length(Performance), function(idx) dummyFunction(Performance[idx], Demand[idx]))
    

    Note that in this particular example you could also just do the following:

    Optimas <- dummyFunction(Performance, Demand)
    

    Finally, if performance becomes an issue, consider using the foreach and doparallel packages to run the optimizations on multiple cores simultaneously:

    library(doParallel)
    library(foreach)
    nrCores <- detectCores()
    cl <- makeCluster(nrCores); registerDoParallel(cl)
    clusterExport(cl,c("Demand", "Performance", "dummyFunction"), envir=environment())
    Optimas <- foreach(idx=1:length(Performance), .combine="rbind") %dopar%{
      dummyFunction(Performance[idx], Demand[idx])
    }
    stopCluster(cl)