java algorithm mathematical-optimization combinatorics

Combinatorial optimization with interdependent variables

This is my first post on these forums, thanks in advance for all responses.

I am developing a java application in which I have encountered what I believe is termed a “Combinatorial Optimization Problem”. I have only basic math skills, so trying to investigate the setup of such a problem have not been fruitful so far.

Basically, what I would like to do is to program the most efficient way of finding the optimal subset n of a larger set N with variables v1, v2, v3 etc. The variables all have a definite value/score in addition to a value/score that is dependant on certain other variables that may or may not be included in the subset.

I am interested in selecting the subset which gives the maximum total value/score.

So, for example, the full set N consists of the following variables and have the following definite values as well as relations to the other variables:

v1  8   { v2 ; v8 }
v2  6   { v1 ; v4 }
v3  9   { }
v4  7   { v2 ; v5 ; v8 }
v5  6   { v4 ; v10 }
v6  8   { v7 }
v7  5   { v6 }
v8  9   { v1 ; v4 }
v9  6   { } 
v10 7   { v5 }

Having a relation to one or more other chosen variable means that the total value will have a certain “take-off” – for the sake of this example let’s say one for each relation. So selecting the first five variables as a subset will give a total value of 30:

v1  8   { v2 ; v8 }      = 8 - 1 = 7
v2  6   { v1 ; v4 }      = 6 - 2 = 4
v3  9   { }              = 9 - 0 = 9
v4  7   { v2 ; v5 ; v8 } = 7 - 2 = 5
v5  6   { v4 ; v10 }     = 6 - 1 = 5

This is of course not a problem for such small sets, but I am currently facing sets of 100K and subsets of 10K – with the current algorithm, my computer calculated the solution in 3 days!

I do not necessarily need a code for solving this, but rather the optimal mathematical way to do it (if there are any!). Keep in mind though that I’m having a hard time understanding mathematical notation above basic level.

Again, thanks in advance for all responses!

Solution

Unfortunately this problem is NP-hard to find the optimal solution.

If you could solve this problem efficiently then you could solve the NP-hard maximum independent set problem by setting the value for each vertex to 1, and a very large penalty for each edge.

So you should instead look for approximate solutions. You may find you get a reasonable answer with simulated annealing or genetic algorithms.