Search code examples
javarecursionartificial-intelligenceminimaxalpha-beta-pruning

Java Minimax Alpha-Beta Pruning Recursion Return


I am trying to implement minimax with alpha-beta pruning for a checkers game in Java. My minimax algorithm works perfectly. My code runs with the alpha-beta code in place. Unfortunately, when I play 1000 games vs the standard minimax algorithm, the alpha-beta algorithm always comes out behind by 50 games or so.

Since alpha-beta pruning should not be reducing the quality of the moves, just the time it takes to achieve them, something has to be wrong. However, I have taken out pen and paper and drawn hypothetical leaf node values and used my algorithm to predict whether it will calculate the correct best move, and there doesn't appear to be any logic errors. I used the tree from this video: Alpha-Beta Pruning to trace my algorithm. It logically should make all of the same choices, and therefore be a functioning implementation.

I have also put print statements into the code (they have been removed to reduce the clutter), and values are being returned correctly it appears and pruning does happen. Despite my best efforts I have been unable to find where the logic error lies. This is my third different attempt at implementing this and all of them have had the same issue.

I can't post the full code here, it's much too long, so I have included the methods that are relevant to the error. I'm not certain, but I suspect the problem may likely be in the non-recursive move() method, though I can't find a logical error in it so I'd just be thrashing around in it more, probably making things worse rather than better without having a rhyme or reason.

Is there a trick to recovering multiple integer values from recursive calls in a for loop? It works fine with both my minimax and negamax implementations, but alpha-beta pruning seems to produce some strange results.

@Override
public GameState move(GameState state) 
{
    int alpha = -INFINITY;
    int beta = INFINITY;
    int bestScore = -Integer.MAX_VALUE;
    GameTreeNode gameTreeRoot = new GameTreeNode(state);
    GameState bestMove = null;
    for(GameTreeNode child: gameTreeRoot.getChildren())
    {
        if(bestMove == null)
        {
            bestMove = child.getState();
        }
        alpha = Math.max(alpha, miniMax(child, plyDepth - 1, alpha, beta));
        if(alpha > bestScore)
        {
            bestMove = child.getState();
            bestScore = alpha;
        }
    }
    return bestMove;
}

private int miniMax(GameTreeNode currentNode, int depth, int alpha, int beta) 
{
    if(depth <= 0 || terminalNode(currentNode.getState())) 
    {
        return getHeuristic(currentNode.getState());
    }
    if(currentNode.getState().getCurrentPlayer().equals(selfColor))
    {
        for(GameTreeNode child: currentNode.getChildren())
        {
            alpha = Math.max(alpha, miniMax(child, depth - 1, alpha, beta));

            if(alpha >= beta)
            {
                return beta;
            }
        }
        return alpha;
    }
    else
    {
        for(GameTreeNode child: currentNode.getChildren())
        {
            beta = Math.min(beta, miniMax(child, depth - 1, alpha, beta));

            if(alpha >= beta)
            {
                return alpha;
            }
        }
        return beta;
    }
}
//Checks to see if the node is terminal
private boolean terminalNode(GameState state)
{
if(state.getStatus().equals(win) || state.getStatus().equals(lose) || state.getStatus().equals(draw))
    {
        return true;
    }
    else
    {
        return false;
    }
}

Solution

  • You already fixed your problem, but the problem you encountered is pretty common. So whenever you build a part of the algorithm for an AI agent, you have to test it properly. So once your minimax algorithm is correct, you can just generate many random trees and check whether the results are the same. For example in python you can do this in this way:

    class Node():
        def __init__(self, data, children):
            self.data = data
            self.children = children
    
    def generateTree(depth, branching):
        total = branching**depth
        values = [randint(-100, 100) for _ in xrange(total)]
        level = [Node(values[i], []) for i in xrange(total)]
    
        for _ in xrange(depth):
            total /= branching
            level = [Node(None, level[i * branching: (i+1) * branching]) for i in xrange(total)]
    
        return level[0], values
    

    Now you can generate a tree with many random trees and compare the results.

    tree, values = generateTree(depth, branching)
    print negamax(tree, depth, 1) == alpha_beta_negamax(tree, depth, float('-inf'), float('inf'), 1)
    

    Do not forget that minimax and alpha-beta return just the best value, whereas what you are interested in a real game is a move. It is straightforward to modify them in such a way that they can return a move, but this is up to a developer to decide how the move is returned. This is because there can be many moves that lead to the best solution (you can return the first one, last one or the most common one is to find all the moves and to return the random one).

    In your case the problem was with the randomness of the returned values, so during the testing the good approach is to fix randomness.