Search code examples
rmathematical-optimizationlpsolve

Finding optimal project team using R


I'm just starting to learn R and have run into something that I'm not sure how to handle in code.

I'm creating a data.frame with a pool of individuals who are available to be assigned to a project. The project needs one BA, one PM, two SA, and one additional person that can be either SA or BA. Each person has a rating and a cost associated with them, I need the max rating while keeping the cost below a certain threshold.

I'm unsure how to acheive the bolded part of the above scenario.. The code below is working but doesn't account for the additional BA/SA.

(This is self-study.. not assigned homework)

EDIT-Desired output where the last row can be of either the SA or BA position.

name     position rating cost   BA PM SA  
Matt        SA     95    9500   0  0  1    
Aaron       BA     85    4700   1  0  0    
Stephanie   SA     95    9200   0  0  1    
Molly       PM     88    5500   0  1  0    
Jake        SA     74    5300   0  0  1

Code:

#load libraries
library(lpSolve)

# create data.frame
name = c("Steve", "Jeremy", "Matt", "Aaron", "Stephanie", "Molly", "Jake", "Tony", "Jay", "Katy", "Alison") 
position = c("BA", "PM", "SA", "BA", "SA", "PM", "SA", "SA", "PM", "BA", "SA")
rating = c(75, 90, 95, 85, 95, 88, 74, 81, 55, 65, 68) 
cost = c(5000, 8000, 9500, 4700, 9200, 5500, 5300, 7300, 3300, 4100, 4400)
df = data.frame(name, position, rating, cost)

# create restrictions
num_ba = 1
num_pm = 1
num_sa = 2
max_cost = 35000

# create vectors to constrain by position
df$BA = ifelse(df$position == "BA", 1, 0)
df$PM = ifelse(df$position == "PM", 1, 0)
df$SA = ifelse(df$position == "SA", 1, 0)

# vector to optimize against
objective = df$rating

# constraint directions
const_dir <- c("=", "=", "=", "<=")

# matrix
const_mat = matrix(c(df$BA, df$PM, df$SA, df$cost), 4, byrow=TRUE)
const_rhs = c(num_ba, num_pm, num_sa, max_cost)

#solve
x = lp("max", objective, const_mat, const_dir, const_rhs, all.bin=TRUE, all.int=TRUE)
print(df[which(x$solution==1), ])

Solution

  • if I got the question right, this could work:

     library(lpSolve)
    
     # create data.frame
     name = c("Steve", "Jeremy", "Matt", "Aaron", "Stephanie", "Molly", "Jake", "Tony", "Jay", "Katy", "Alison") 
    position = c("BA", "PM", "SA", "BA", "SA", "PM", "SA", "SA", "PM", "BA", "SA")
    rating = c(75, 90, 95, 85, 95, 88, 74, 81, 55, 65, 68) 
    cost = c(5000, 8000, 9500, 4700, 9200, 5500, 5300, 7300, 3300, 4100, 4400)
    
    df = data.frame(name, position, rating, cost)
    
    # create restrictions
    num_pm = 1
    min_num_ba = 1
    min_num_sa = 2
    tot_saba   = 4
    max_cost = 35000
    
    # create vectors to constrain by position
    df$PM = ifelse(df$position == "PM", 1, 0)
    df$minBA = ifelse(df$position == "BA", 1, 0)
    df$minSA = ifelse(df$position == "SA", 1, 0)
    df$SABA = ifelse(df$position %in% c("SA","BA"), 1, 0)
    
    # vector to optimize against
    objective = df$rating
    
    # constraint directions
    const_dir <- c("==", ">=", "<=", "==", "<=")
    
    # matrix
    const_mat = matrix(c(df$PM, df$minBA, df$minSA, df$SABA, df$cost), 5, byrow=TRUE)
    const_rhs = c(num_pm, min_num_ba,min_num_sa, tot_saba, max_cost)
    
    #solve
    x = lp("max", objective, const_mat, const_dir, const_rhs, all.bin=TRUE, all.int=TRUE)
    print(df[which(x$solution==1), ])
    

    what I'm doing is modifying some constraints and adding a new one: number of BAs must be >= 1. Number of SA >= 2, and the sum of BA and SA must be 4, so that you always select 5 people.

    This however gives a different solution than what wrote by the OP:

           name position rating cost PM minBA minSA SABA
    1     Steve       BA     75 5000  0     1     0    1
    3      Matt       SA     95 9500  0     0     1    1
    4     Aaron       BA     85 4700  0     1     0    1
    5 Stephanie       SA     95 9200  0     0     1    1
    6     Molly       PM     88 5500  1     0     0    0
    

    However, summing the rating of this solution gives 438, while the op result is 437, so this should be correct.

    HTH.