This is a follow-up question to: Finding max value of a weighted subset sum of a power set Whereas the previous question solves (to optimality) problems of size <= 15 in reasonable time, I would like to solve problems of size ~2000 to near-optimality.
As a small example problem, I am given a certain range of nodes:
var range = [0,1,2,3,4];
A function creates a power set for all the nodes in the range and assigns each combination a numeric score. Negative scores are removed, resulting in the following array S
. S[n][0]
is the bitwise OR of all included nodes, and S[n][1]
is the score:
var S = [
[1,0], //0
[2,0], //1
[4,0], //2
[8,0], //3
[16,0], //4
[3,50], //0-1
[5,100], //0-2
[6,75], //1-2
[20,130], //2-4
[7,179] //0-1-2 e.g. combining 0-1-2 has a key of 7 (bitwise `1|2|4`) and a score of 179.
];
The optimal solution, maximizing the score, would be:
var solution = [[8,3,20],180]
Where solution[0]
is an array of combinations from S
. and solution[1]
is the resulting score. Note that bitwise 8 & 3 & 20 == 0
signifying that each node is used only once.
Problem specifics: Each node must be used exactly once and the score for the single-node combinations will always be 0, as shown in the S
array above.
My current solution (seen here) uses dynamic programming and works for small problems. I have seen heuristics involving dynamic programming, such as https://youtu.be/ze1Xa28h_Ns, but can't figure out how I'd apply that to a multi-dimensional problem. Given the problem constraints, what would be a reasonable heuristic to apply?
EDIT: Things I've tried
score
greatest to least, pick the next viable candidate)score/cardinality(combo)
score
by up to 10%, then sort, repeat until a better solution hasn't been found in x seconds)This problem is really an integer optimization problem, with binary variables x_i
indicating if the i
^th element of S
is selected and constraints indicating that each bit is used exactly once. The objective is to maximize the score attained across the selected elements. If we defined S_i
to be the i
^th element of S
, L_b
to be the indices of elements in S
with bit b
set, w_i
to be the score associated with element i
, and assumed there were n
elements in set S
and k
bits, we could write this in mathematical notation as:
min_{x} \sum_{i=1..n} w_i*x_i
s.t. \sum_{i \in L_b} x_i = 1 \forall b = 1..k
x_i \in {0, 1} \forall i = 1..n
In many cases, linear programming solvers are much (much, much) more effective than exhaustive search at solving these sorts of problems. Unfortunately I am not aware of any javascript linear programming libraries (a Google query turned up SimplexJS and glpk.js and node-lp_solve -- I have no experience with any of these and couldn't immediately get any to work). As a result I will do the implementation in R using the lpSolve
package.
w <- c(0, 0, 0, 0, 0, 50, 100, 75, 130, 179)
elt <- c(1, 2, 4, 8, 16, 3, 5, 6, 20, 7)
k <- 5
library(lpSolve)
mod <- lp(direction = "max",
objective.in = w,
const.mat = t(sapply(1:k, function(b) 1*(bitwAnd(elt, 2^(b-1)) > 0))),
const.dir = rep("=", k),
const.rhs = rep(1, k),
all.bin = TRUE)
elt[mod$solution > 0.999]
# [1] 8 3 20
mod$objval
# [1] 180
As you'll note, this is an exact formulation of your problem. However, by setting a timeout (you'd actually need to use the lpSolveAPI
package in R instead of the lpSolve
package to do this), you can get the best solution found by the solver before reaching your specified timeout. This may not be an optimal solution, and you can control how long before the heuristic stops trying to find better solutions. If the solver terminates before the timeout, the solution is guaranteed to be optimal.