I'm trying to determine the optimal search case to compare to a search algorithm I have written.
I have a set of nodes marked "required" and a node marked "start", the rest are marked "optional". I want to find the optimal number of nodes I would need to expand to discover all of the required nodes given that I my first expand is the "start" node.
A few notes:
I have a set of nodes marked "required" and a node marked "start", the rest are marked "optional". I want to find the optimal number of nodes I would need to expand to discover all of the required nodes given that I my first expand is the "start" node.
If the cost of expanding a node can be arbitrary, then this is the node-weighted Steiner tree problem, which, under a plausible complexity-theoretic assumption, has no polynomial-time approximation algorithm with a ratio that's o(log n).
I believe what I am looking for is the minimum spanning tree, but pruned of all branches that do not end in a "required" node.
No, this is in general not optimal. For example, with the graph
s
/|\
/ | \
* | *
/ | \
/ | \
r1----*----r2,
one possible MST looks like /|\
or /\
when pruned, but the optimal solution looks like _|_
.
What if anything can I say about the size of the tree?
Theoretically speaking, you could get a lower bound by solving the dual of the LP relaxation of the integer program for Steiner tree (actually with a graph of a the size you're thinking about, it wouldn't surprise me if a solver could determine the optimal Steiner tree straight up).
Practically speaking, however, this is not how people evaluate search algorithms.