Search code examples
algorithmlanguage-agnosticpath-finding

Pathfinding while forcing unique node attributes -- which algorithm should I use?


Update 2011-12-28: Here's a blog post with a less vague description of the problem I was trying to solve, my work on it, and my current solution: Watching Every MLB Team Play A Game


I'm trying to solve a kind of strange pathfinding challenge. I have an acyclic directional graph, and every edge has a distance value. And I want to find a shortest path. Simple, right? Well, there are a couple of reasons I can't just use Dijkstra's or A*.

  1. I don't care at all what the starting node of my path is, nor the ending node. I just need a path that includes exactly 10 nodes. But:
  2. Each node has an attribute, let's say it's color. Each node has one of 20 different possible colors.
  3. The path I'm trying to find is the shortest path with exactly 10 nodes, where each node is a different color. I don't want any of the nodes in my path to have the same color as any other node.
  4. It'd be nice to be able to force my path to have one value for one of the attributes ("at least one node must be blue", for instance), but that's not really necessary.

This is a simplified example. My full data set actually has three different attributes for each node that must all be unique, and I have 2k+ nodes each with an average of 35 outgoing edges. Since getting a perfect "shortest path" may be exponential or factorial time, an exhaustive search is really not an option. What I'm really looking for is some approximation of a "good path" that meets the criterion under #3.

Can anyone point me towards an algorithm that I might be able to use (even modified)?


Some stats on my full data set:

  • Total nodes: 2430
  • Total edges: 86524
  • Nodes with no incoming edges: 19
  • Nodes with no outgoing edges: 32
  • Most outgoing edges: 42
  • Average edges per node: 35.6 (in each direction)
  • Due to the nature of the data, I know that the graph is acyclic
  • And in the full data set, I'm looking for a path of length 15, not 10

Solution

  • It is the case when the question actually contains most of the answer.

    Do a breadth-first search starting from all root nodes. When the number of parallelly searched paths exceeds some limit, drop the longest paths. Path length may be weighed: last edges may have weight 10, edges passed 9 hops ago - weight 1. Also it is possible to assign lesser weight to all paths having the preferred attribute or paths going through the weakly connected nodes. Store last 10 nodes in the path to the hash table to avoid duplication. And keep somewhere the minimum sum of the last 9 edge lengths along with the shortest path.