I have a graph containing sum nodes, choice nodes and point nodes. I'd like to maximize the number of points at the root of the graph.
Here's an example (graphviz):
There are four possible choices here, which result in the following number of points:
So the maximum value of this tree is 5, achieved by choosing 1=y and 2=e. Notably this is less than what you'd get if you took the maximum value at each choice node. That would yield 6 points (3 + 1 + 2) but would require inconsistent choices (1=y/2=e on the left subtree, 1=x/2=a on the right subtree).
This is a small example, but in practice there are 10-20 choices to be made and 5-10 options for each choice: too many to enumerate exhaustively.
My questions are:
Here's a gist with an example tree and Python code to load it. Some stats on the example:
This tree is still relatively small, but it gives a flavor of what these look like.
What have I tried so far? Many things, but a few highlights:
So I've tried a lot! Between these I'm able to get a 1,000+x speedup over exhaustive search, depending on the tree. But I still have the sneaking sense that there's some better way to think about this (dynamic programming?), or a known algorithm that this maps onto.
The problem is NP-hard. Suppose you have a 3-CNF formula with $m$ clauses on variables $x_1,\dots,x_n$. We'll build a tree whose root can achieve $m$ points iff the formula is satisfiable, otherwise its root can only achieve $\le m-1$ points. Given any 3-CNF clause, it is easy to build a subtree of depth 3 (with 4 leaves) whose value will be 1 if the clause is satisfied and will be 0 if the clause is not satisfied. (Each choice node is numbered from 1 to $n$; a choice node with number $i$ corresponds to variable $x_i$.). Then the root node is a sum node with $m$ children; each child of the root is a subtree corresponding to a clause of the formula.
So you should not expect any efficient algorithm that is always correct and scales well to large problem instances.
In practice you could try to solve this with an integer linear programming (ILP) solver. For each separate choice node, add a zero-or-one integer variable. Also add a variable for each node of the tree expressing the value on that node. Then you can write constraints that express how the value on each node relates to the values on its children, and use the ILP solver to maximize the value on the root node. Whether this will work well enough will depend on your problem instance; the only way to know is to try it and see.