Tags: artificial-intelligence, alpha-beta-pruning

Unexpected path dependence in alpha-beta search?


I'm writing an artificial intelligence for the old Norse tafl family of board games (project here, source file at issue here). They're close enough to chess in broad strokes for knowledge of chess AI to apply here. (The variant in question is played on a 7x7 board with a radially symmetric starting position, white starting in the middle and black starting at the edge.) I'm running into a curious problem with how I've implemented alpha-beta search: the result of a search to a fixed depth, with no optimizations enabled besides alpha-beta pruning, changes depending on the order in which nodes are explored.

In the file at issue, the important methods are explore, exploreChildren, handleEvaluationResults, and generateSuccessorMoves. explore checks for a transposition table hit (disabled elsewhere for this test), evaluates the state if it's a victory or a leaf node, and otherwise calls exploreChildren. exploreChildren does the recursive searching on child nodes. generateSuccessorMoves generates (and optionally sorts) the moves available from the current state. handleEvaluationResults determines whether a child's evaluation has caused a cutoff.

So, I wrote a minimal test case: generateSuccessorMoves first does no sorting whatsoever, then simply shuffles the list of moves instead of sorting it. The two runs do not agree on the best move, nor on a symmetrically equivalent move, nor even on the value:

MAIN SEARCH
# cutoffs/avg. to 1st a/b a/b
Depth 0: 0/0 0/0
Depth 1: 0/22 0/1
Depth 2: 42/0 3/0
Finding best state...
Best move: d3-g3 with path...
    d3-g3
    e1-f1
    e4-e1xf1
End of best path scored -477
Observed/effective branching factor: 23.00/9.63
Thought for: 72msec. Tree sizes: main search 893 nodes, continuation search: 0 nodes, horizon search: 0 nodes
Overall speed: 12402.77777777778 nodes/sec
Transposition table stats: Table hits/queries: 0/0 Table inserts/attempts: 0/0
1. move: d3-g3 value: -477
Using 5000msec, desired 9223372036854775807
Depth 3 explored 1093 states in 0.037 sec at 29540.54/sec
MAIN SEARCH
# cutoffs/avg. to 1st a/b a/b
Depth 0: 0/0 0/0
Depth 1: 0/21 0/2
Depth 2: 104/0 2/0
Finding best state...
Best move: d3-f3 with path...
    d3-f3
    d2-c2
    d5-f5xf4
End of best path scored -521
Observed/effective branching factor: 23.00/10.30
Thought for: 37msec. Tree sizes: main search 1093 nodes, continuation search: 0 nodes, horizon search: 0 nodes
Overall speed: 29540.540540540544 nodes/sec
Transposition table stats: Table hits/queries: 0/0 Table inserts/attempts: 0/0
7. move: d3-f3 value: -521

This is an extreme case, obviously, but my understanding is that alpha-beta search in this situation (that is, with nothing enabled besides alpha-beta pruning itself) should be stable no matter what order the nodes are explored in: at the very least, it should return a move of the same value. Am I wrong? Am I doing something wrong?

First edit: although I suppose it's obvious from the description of this problem, it turns out that there is some as-yet unknown bug in my alpha-beta implementation. Further testing shows that it does not provide the same result as pure minimax.
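
A pure-minimax oracle for that kind of cross-check can be very small; the sketch below is one way to write it (the Node type and the example values are toys for illustration, not the project's classes). Because it visits every node, its result cannot depend on move order, so any disagreement with the alpha-beta result points at the pruning side.

import java.util.List;

// A toy pure-minimax oracle: visits every node, so its result cannot depend on move order.
final class MinimaxReference {
    static final class Node {
        final Integer leafValue;      // non-null for leaves
        final boolean maximizing;     // each node knows whose turn it is
        final List<Node> children;
        Node(int leafValue) { this.leafValue = leafValue; this.maximizing = false; this.children = List.of(); }
        Node(boolean maximizing, List<Node> children) { this.leafValue = null; this.maximizing = maximizing; this.children = children; }
    }

    static int minimax(Node node) {
        if (node.leafValue != null) return node.leafValue;   // leaf: static evaluation
        int best = node.maximizing ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (Node child : node.children) {
            int v = minimax(child);
            best = node.maximizing ? Math.max(best, v) : Math.min(best, v);
        }
        return best;
    }

    public static void main(String[] args) {
        // max( min(3, 17), min(2, 30) ) = max(3, 2) = 3
        Node root = new Node(true, List.of(
                new Node(false, List.of(new Node(3), new Node(17))),
                new Node(false, List.of(new Node(2), new Node(30)))));
        System.out.println("minimax root value: " + minimax(root));   // prints 3
    }
}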

Second edit: this is the pseudocode version of the alpha-beta search implemented in the file linked above.

explore(depth, maxDepth, alpha, beta)
    // some tafl variants feature rules where one player moves more than once in a turn
    // each game tree node knows whether it's maximizing or minimizing
    var isMaximizing = this.isMaximizing()
    var value = NO_VALUE

    if(isTerminal(depth, maxDepth))
        value = eval()
    else
        for(move in successorMoves)
            if(cutoff) break    // cutoff flag set by handleEvaluationResults for a previous child

            nodeValue = nextNode(move).explore(depth + 1, maxDepth, alpha, beta)
            if(value == NO_VALUE) value = nodeValue

            if(isMaximizing)
                value = max(value, nodeValue)
                alpha = max(alpha, value)
                if(beta <= alpha) break
            else
                value = min(value, nodeValue)
                beta = min(beta, value)
                if(beta <= alpha) break

    return value

rootNode.explore(0, 5, -infinity, infinity)
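
To make the order-independence claim testable in isolation, here is a self-contained sketch of that pseudocode on a toy tree (the Node type, the alternating maximizing levels, the branching factor, and the random leaf scores are all invented for illustration; the two per-branch cutoff checks are merged into a single one after the if/else). With a full window at the root, the value it returns should be identical under any shuffle of the children:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Stand-alone sketch of the pseudocode above, run on a toy tree instead of real tafl states.
final class AlphaBetaOrderCheck {
    static final class Node {
        final Integer leafValue;                       // non-null for leaves
        final boolean maximizing;                      // each node knows whether it maximizes
        final List<Node> children = new ArrayList<>();
        Node(Integer leafValue, boolean maximizing) { this.leafValue = leafValue; this.maximizing = maximizing; }
    }

    // Direct translation of the pseudocode, with the cutoff test after the alpha/beta update.
    static int explore(Node node, int alpha, int beta) {
        if (node.leafValue != null) return node.leafValue;   // terminal: static evaluation
        Integer value = null;                                 // NO_VALUE
        for (Node child : node.children) {
            int nodeValue = explore(child, alpha, beta);
            if (value == null) value = nodeValue;
            if (node.maximizing) {
                value = Math.max(value, nodeValue);
                alpha = Math.max(alpha, value);
            } else {
                value = Math.min(value, nodeValue);
                beta = Math.min(beta, value);
            }
            if (beta <= alpha) break;                         // cutoff
        }
        return value;
    }

    // Builds a random tree with alternating maximizing/minimizing levels and random leaf scores.
    static Node randomTree(Random rng, int depth, boolean maximizing) {
        if (depth == 0) return new Node(rng.nextInt(201) - 100, maximizing);
        Node node = new Node(null, maximizing);
        for (int i = 0; i < 4; i++) node.children.add(randomTree(rng, depth - 1, !maximizing));
        return node;
    }

    // Recursively shuffles every node's child list, i.e. changes the move ordering everywhere.
    static void shuffle(Node node, Random rng) {
        Collections.shuffle(node.children, rng);
        for (Node child : node.children) shuffle(child, rng);
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        Node root = randomTree(rng, 4, true);
        int reference = explore(root, Integer.MIN_VALUE, Integer.MAX_VALUE);
        for (int trial = 0; trial < 1000; trial++) {
            shuffle(root, rng);
            int value = explore(root, Integer.MIN_VALUE, Integer.MAX_VALUE);
            if (value != reference)
                throw new AssertionError("move order changed the root value: " + value + " vs " + reference);
        }
        System.out.println("root value " + reference + " was stable across 1000 random move orders");
    }
}

If the real engine fails an analogous check while a direct translation like this passes, the discrepancy has to come from code outside this loop, which is consistent with the solution below.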

Solution

  • It turns out it was my fault. I have a bit of code which recursively revalues the nodes above a given node, for use in extension searches, and I was calling it in the wrong place: after exploring the children of every node. That early back-propagation fed incorrect alpha and beta values back into the search, and therefore caused premature cutoffs.
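
One way to see why that kind of early back-propagation is dangerous: part-way through an alpha-beta search, and in particular after a cutoff, a node's accumulated value is only a bound on its true minimax value, so pushing it up the tree as if it were exact feeds alpha and beta information that is simply wrong. A toy illustration (invented Node type, not the project's classes):

import java.util.List;

// Toy illustration: inside a narrowed window, a pruned node's alpha-beta value is only a bound.
final class PrunedValueIsABound {
    static final class Node {
        final Integer leafValue;      // non-null for leaves
        final boolean maximizing;
        final List<Node> children;
        Node(int leafValue) { this.leafValue = leafValue; this.maximizing = false; this.children = List.of(); }
        Node(boolean maximizing, List<Node> children) { this.leafValue = null; this.maximizing = maximizing; this.children = children; }
    }

    // Plain alpha-beta, shaped like the pseudocode in the question.
    static int alphaBeta(Node node, int alpha, int beta) {
        if (node.leafValue != null) return node.leafValue;
        Integer value = null;
        for (Node child : node.children) {
            int v = alphaBeta(child, alpha, beta);
            if (value == null) value = v;
            if (node.maximizing) { value = Math.max(value, v); alpha = Math.max(alpha, value); }
            else                 { value = Math.min(value, v); beta  = Math.min(beta,  value); }
            if (beta <= alpha) break;   // cutoff: the remaining children are never examined
        }
        return value;
    }

    public static void main(String[] args) {
        // B is a MIN node over the leaves {6, 5, 3}; its true minimax value is 3.
        Node b = new Node(false, List.of(new Node(6), new Node(5), new Node(3)));

        // Searched with a full window, alpha-beta agrees: 3.
        System.out.println("B, full window:    " + alphaBeta(b, Integer.MIN_VALUE, Integer.MAX_VALUE));

        // Searched the way it would be searched mid-game, with alpha already raised to 5 by an
        // earlier sibling, the search cuts off after seeing the 5 and reports 5: an upper bound,
        // not B's real value. Writing that 5 into ancestors as if it were exact is the kind of
        // early back-propagation that poisons alpha and beta elsewhere in the tree.
        System.out.println("B, window alpha=5: " + alphaBeta(b, 5, Integer.MAX_VALUE));
    }
}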