Search code examples
pythonmathematical-optimizationcombinatorics

Combinatorial minimisation


I'm trying to find the most efficient way to do this.

Students can be assigned to stops A, B, or C. We are given a list of stops which students are eligible to be assigned to.

##Create an empty dataframe
df = pd.DataFrame()

## Create a list with potential stops that each student qualifies for

potential_stops_for_each_student = [['A','B','C'], ['B','C'], ['C']]
df['Potential Stops'] = potential_stops_for_each_student

My goal here is to minimize the total number of stops, so clearly all students should be assigned to bus stop 'C' in this example. My solution currently looks at all possible combinations (i.e. (A, B, C), (A,C,C), (B,B,C), (B,C,C),(C,B,C), (C,C,C)), and then chooses the combination with the least number of unique elements.

This seems like an awful solution to me, so I was hoping someone could point me towards a better solution. A reference to a package or any existing literature would do, you don't have to code if you don't want!

Thank you!


Solution

  • Here's a quick stab at it using cvxpy with the GLPK solver: you will need to install it as well as cvxpy using the instructions here.

    #import the packages:
    import numpy as np
    import cvxpy
    
    #we will set the number of stops and children here:
    nstops = 10
    nchildren = 50
    
    #data looks like a 1/0 array for each child, for each stop
    #you can use pd.get_dummies to do this
    data = np.random.randint(2, size=(nchildren, nstops))
    
    #here we set our cvxpy variable: this varies to find the best solution
    #it will be the boolean to determine which stops we chose
    selection = cvxpy.Variable(nstops, boolean=True)
    
    #now we make our constraints
    #for each child, the sum of the product of the chosen stops and their available stops must be >=1
    #we can probably use a better formulation of this, but it works
    constraints = []
    for i in range(nchildren):
        #we use cvxpy operations to allow the problem to be formulated
        constraints.append(cvxpy.sum(cvxpy.multiply(data[i,:],selection )) >= 1)
       
    #now our problem is: minimize the sum of the number of chosen stops:
    cost = cvxpy.sum(selection)
    
    #we want to minimize it, subject to our constraints
    problem = cvxpy.Problem(cvxpy.Minimize(cost), constraints=constraints)
    
    #then we solve it!
    score = problem.solve(solver=cvxpy.GLPK_MI)
    
    #we have our 'selection' as a numpy array of booleans: here we get the indices of the chosen stops
    np.where(selection.value.astype(np.bool))