I'm just starting to learn R and have run into something that I'm not sure how to handle in code.
I'm creating a data.frame with a pool of individuals who are available to be assigned to a project. The project needs one BA, one PM, two SAs, and one additional person who can be either an SA or a BA. Each person has a rating and a cost associated with them; I need to maximize the total rating while keeping the total cost below a certain threshold.
I'm unsure how to achieve the last requirement of the above scenario (the additional person who can be either SA or BA). The code below works but doesn't account for that additional BA/SA.
(This is self-study, not assigned homework.)
EDIT: Desired output, where the last row can be in either the SA or the BA position:
name       position  rating  cost  BA  PM  SA
Matt       SA        95      9500  0   0   1
Aaron      BA        85      4700  1   0   0
Stephanie  SA        95      9200  0   0   1
Molly      PM        88      5500  0   1   0
Jake       SA        74      5300  0   0   1
Code:
#load libraries
library(lpSolve)
# create data.frame
name = c("Steve", "Jeremy", "Matt", "Aaron", "Stephanie", "Molly", "Jake", "Tony", "Jay", "Katy", "Alison")
position = c("BA", "PM", "SA", "BA", "SA", "PM", "SA", "SA", "PM", "BA", "SA")
rating = c(75, 90, 95, 85, 95, 88, 74, 81, 55, 65, 68)
cost = c(5000, 8000, 9500, 4700, 9200, 5500, 5300, 7300, 3300, 4100, 4400)
df = data.frame(name, position, rating, cost)
# create restrictions
num_ba = 1
num_pm = 1
num_sa = 2
max_cost = 35000
# create vectors to constrain by position
df$BA = ifelse(df$position == "BA", 1, 0)
df$PM = ifelse(df$position == "PM", 1, 0)
df$SA = ifelse(df$position == "SA", 1, 0)
# vector to optimize against
objective = df$rating
# constraint directions
const_dir <- c("=", "=", "=", "<=")
# matrix
const_mat = matrix(c(df$BA, df$PM, df$SA, df$cost), 4, byrow=TRUE)
const_rhs = c(num_ba, num_pm, num_sa, max_cost)
#solve
x = lp("max", objective, const_mat, const_dir, const_rhs, all.bin=TRUE, all.int=TRUE)
print(df[which(x$solution==1), ])
If I understood the question correctly, this should work:
library(lpSolve)
# create data.frame
name = c("Steve", "Jeremy", "Matt", "Aaron", "Stephanie", "Molly", "Jake", "Tony", "Jay", "Katy", "Alison")
position = c("BA", "PM", "SA", "BA", "SA", "PM", "SA", "SA", "PM", "BA", "SA")
rating = c(75, 90, 95, 85, 95, 88, 74, 81, 55, 65, 68)
cost = c(5000, 8000, 9500, 4700, 9200, 5500, 5300, 7300, 3300, 4100, 4400)
df = data.frame(name, position, rating, cost)
# create restrictions
num_pm = 1
min_num_ba = 1
min_num_sa = 2
tot_saba = 4
max_cost = 35000
# create vectors to constrain by position
df$PM = ifelse(df$position == "PM", 1, 0)
df$minBA = ifelse(df$position == "BA", 1, 0)
df$minSA = ifelse(df$position == "SA", 1, 0)
df$SABA = ifelse(df$position %in% c("SA","BA"), 1, 0)
# vector to optimize against
objective = df$rating
# constraint directions: PM == 1, BA >= 1, SA >= 2, BA + SA == 4, cost <= max_cost
const_dir <- c("==", ">=", ">=", "==", "<=")
# matrix
const_mat = matrix(c(df$PM, df$minBA, df$minSA, df$SABA, df$cost), 5, byrow=TRUE)
const_rhs = c(num_pm, min_num_ba, min_num_sa, tot_saba, max_cost)
#solve
x = lp("max", objective, const_mat, const_dir, const_rhs, all.bin=TRUE, all.int=TRUE)
print(df[which(x$solution==1), ])
What I'm doing is modifying some of the constraints and adding a new one: the number of BAs must be >= 1, the number of SAs must be >= 2, and the sum of BAs and SAs must be exactly 4, so that together with the one PM you always select 5 people.
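To make those constraints concrete, here is a small base-R sketch that evaluates each of them for a given team. Note that check_team() is a hypothetical helper I made up for illustration (it is not part of lpSolve), and the check names are my own labels. As an example it is applied to the team from the question's desired output.

```r
# Candidate pool, copied from the code above
name     <- c("Steve", "Jeremy", "Matt", "Aaron", "Stephanie", "Molly",
              "Jake", "Tony", "Jay", "Katy", "Alison")
position <- c("BA", "PM", "SA", "BA", "SA", "PM", "SA", "SA", "PM", "BA", "SA")
cost     <- c(5000, 8000, 9500, 4700, 9200, 5500, 5300, 7300, 3300, 4100, 4400)

# check_team() is a hypothetical helper (not an lpSolve function).
# It returns a named logical vector, one entry per constraint.
check_team <- function(team, max_cost = 35000) {
  sel  <- name %in% team
  n_pm <- sum(sel & position == "PM")
  n_ba <- sum(sel & position == "BA")
  n_sa <- sum(sel & position == "SA")
  c(one_pm      = n_pm == 1,
    min_one_ba  = n_ba >= 1,
    min_two_sa  = n_sa >= 2,
    four_ba_sa  = n_ba + n_sa == 4,
    within_cost = sum(cost[sel]) <= max_cost)
}

# Team from the question's desired output: 1 PM, 1 BA, 3 SA
op_team <- c("Matt", "Aaron", "Stephanie", "Molly", "Jake")
print(check_team(op_team))
```

All five checks come back TRUE for that team, so the OP's desired output is a feasible point under this formulation; whether it is also the optimal one is a separate question.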
Note, however, that the lp() solution differs from the one posted by the OP:
   name       position  rating  cost  PM  minBA  minSA  SABA
1  Steve      BA        75      5000  0   1      0      1
3  Matt       SA        95      9500  0   0      1      1
4  Aaron      BA        85      4700  0   1      0      1
5  Stephanie  SA        95      9200  0   0      1      1
6  Molly      PM        88      5500  1   0      0      0
However, the ratings of this solution sum to 438 (at a total cost of 33900, under the 35000 limit), while the OP's result sums to 437, so this solution should be the correct one.
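Since the pool is tiny, the 438 figure can also be double-checked without a solver. The sketch below is just a brute-force sanity check in base R, not a replacement for lp(): it enumerates all 462 five-person teams with combn(), keeps the feasible ones, and picks the highest total rating. It assumes the same data and the same 35000 cost limit as above.

```r
# Candidate pool, copied from the code above
name     <- c("Steve", "Jeremy", "Matt", "Aaron", "Stephanie", "Molly",
              "Jake", "Tony", "Jay", "Katy", "Alison")
position <- c("BA", "PM", "SA", "BA", "SA", "PM", "SA", "SA", "PM", "BA", "SA")
rating   <- c(75, 90, 95, 85, 95, 88, 74, 81, 55, 65, 68)
cost     <- c(5000, 8000, 9500, 4700, 9200, 5500, 5300, 7300, 3300, 4100, 4400)
max_cost <- 35000

# Each column of 'teams' holds the row indices of one 5-person team
teams <- combn(length(name), 5)

# A team of 5 with exactly 1 PM automatically has BA + SA == 4,
# so that constraint does not need its own test here.
feasible <- apply(teams, 2, function(idx) {
  pos <- position[idx]
  sum(pos == "PM") == 1 &&
    sum(pos == "BA") >= 1 &&
    sum(pos == "SA") >= 2 &&
    sum(cost[idx]) <= max_cost
})

# Total rating per team; infeasible teams are excluded from the maximum
total_rating <- apply(teams, 2, function(idx) sum(rating[idx]))
total_rating[!feasible] <- -Inf

best <- teams[, which.max(total_rating)]
print(name[best])          # members of the best feasible team
print(sum(rating[best]))   # its total rating
print(sum(cost[best]))     # its total cost
```

On this data the enumeration picks Steve, Matt, Aaron, Stephanie and Molly, with a total rating of 438 and a total cost of 33900, which matches the lp() output. Brute force only scales to very small pools, so for anything larger the lp() formulation is the way to go.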
HTH.