I am trying to explore the area of optimization using GAs in R. I realize that R is probably not the best place for me to do this but that is what I know right now.
So here is the problem: I have a set of people that I can pick to build my house most efficiently. I am willing to spend a maximum of 100,000 for the job (the limit beyond which the solution is no longer valid) and I HAVE to pick one electrician, two plumbers, two general construction workers, and two cleanup guys from a list of 500 or so (total 7 guys for the job).
Each of these people have a rate they will charge for the job and overall value they will provide towards the job at hand (the sum total value needs to be maximized). See the data here:
I have looked at R packages genalg and DeOptim where you can use the binary functions to select people that will maximize the overall value for a given price BUT I cant figure out how to put a limit on what type of worker is selected. The solution can be all electricians and while they all provide great value, they cant complete the job. So basically I am looking for a way to control the population and none of these packages seem to allow that.
Does any have ideas on how i can solve this problem? I am sure many of you have looked at variations of this problem and have excellent insight. Thanks a lot in advance for your help.
Here is the code I have obtained from forums here:
library(genalg)
iter = 10
population = 500
# import csv file
brank = read.csv("GA_Data.csv")
dataset <- data.frame(item = brank$Person.ID, survivalpoints = brank$Value, weight = brank$Labor.Cost)
weightlimit <- 100000
monitor <- function(obj) {
minEval = min(obj$evaluations);
plot(obj$mean, obj$best, type="p", main = obj$iter);
}
evalFunc <- function(x) {
current_solution_survivalpoints <- x %*% dataset$survivalpoints
current_solution_weight <- x %*% dataset$weight
if (current_solution_weight > weightlimit)
return(0) else return(-current_solution_survivalpoints)
}
GAmodel <- rbga.bin(size = length(brank$Person.ID), popSize = population, iters = iter, mutationChance = 0.01,
monitorFunc = monitor, elitism = T, zeroToOneRatio = 200, evalFunc = evalFunc)
cat(summary.rbga(GAmodel))
########################
# STOP and replace "space" with "comma" in a text file
solution = c(0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0)
# dataset[solution == 1, ]
brank[solution == 1, ]
sum(brank$Value[solution == 1])
sum(brank$Labor.Cost[solution == 1])
Your problem is a Linear Programming (LP) task and should be solved as such. R has several packages for linear programming, here I will use lpSolve
.
I put your data into a data frame D
(299 workers).
str(D)
# 'data.frame': 299 obs. of 4 variables:
# $ ID : int 1 2 3 4 5 6 7 8 9 10 ...
# $ Type : Factor w/ 4 levels "Cleanup Guy",..: 4 4 4 4 4 4 4 4 4 4 ...
# $ Cost : int 5100 3900 5000 4500 6700 5000 4000 3500 7500 4500 ...
# $ Value: int 25 18 23 20 29 21 17 15 32 19 ...
# Prepare constraint matrix
A <- matrix(0, nrow = 5, ncol = 299)
A[1, c(1:27, 279:299)] <- 1 # Plumbers
A[2, 28:97] <- 1 # Electricians
A[3, 98:190] <- 1 # Const Workers
A[4, 191:278] <- 1 # Cleanup
A[5, ] <- D$Cost # cost <= 100000
# Prepare input for LP solver
objective.in <- D$Value
const.mat <- A
const.dir <- c(">=", ">=", ">=", ">=", "<=")
const.rhs <- c(1, 2, 2, 2, 100000)
# Now solve the problem
require(lpSolve)
sol <- lp(direction = "max", objective.in, # maximize objective function
const.mat, const.dir, const.rhs, # constraints
all.bin = TRUE) # use binary variables only
# View the solution
sol
## Success: the objective function is 589
sum(D$Cost[inds])
## 100000
inds <- which(sol$solution == 1)
D[inds, ]
## ID Type Cost Value
## 1 1 Plumber 5100 25
## 51 51 Electrician 5000 24
## 54 54 Electrician 3500 16
## 101 101 Const Worker 3500 18
## 102 102 Const Worker 4400 21
## 103 103 Const Worker 5800 27
## 147 147 Const Worker 5600 29
## 149 149 Const Worker 3700 17
## 196 196 Cleanup Guy 6200 30
## 197 197 Cleanup Guy 4200 20
## 256 256 Cleanup Guy 4000 27
## 263 263 Cleanup Guy 3800 22
## 264 264 Cleanup Guy 3800 33
## 266 266 Cleanup Guy 3700 23
## 267 267 Cleanup Guy 4200 30
## 273 273 Cleanup Guy 3600 35
## 274 274 Cleanup Guy 5900 28
## 275 275 Cleanup Guy 3700 28
## 282 282 Plumber 7000 37
## 287 287 Plumber 3800 30
## 297 297 Plumber 3600 37
## 298 298 Plumber 5900 32
For explanation of the inputs see the help page of the lp()
function.