java, algorithm, tic-tac-toe, minimax, alpha-beta-pruning

Java Alpha-Beta Pruning in Tic Tac Toe


I have a game of Tic Tac Toe that uses the minimax algorithm, and I want to improve it by adding alpha-beta pruning. However, the alpha-beta method does not seem to calculate moves correctly: it just puts its piece in the next available space, whether that is the optimal move or not. I do not have this issue with the minimax method. I'm sure it's something simple that I keep overlooking, so forgive me. I used this tutorial for minimax and this tutorial for alpha-beta pruning.

This is the Minimax class. It includes the alpha-beta method:

public class Minimax {

    private Token playerToken;
    private EndStates endStates;
    private Token opponentToken;

    public Minimax(Token playerToken, EndStates endStates) {
        this.playerToken = playerToken;
        this.endStates = endStates;
        opponentToken = makeOpponentToken();
    }

    public Token makeOpponentToken() {
        if (playerToken == Token.O) {
            return Token.X;
        }
        else {
            return Token.O;
        }
    }

    public Token getOpponentToken() {
        return opponentToken;
    }

    public int evaluate(Cell[] board) {
        // check every win condition (rows, columns and diagonals) for both players
        if (endStates.checkWinByRow(board, playerToken) || endStates.checkWinByColumn(board, playerToken) || endStates.checkWinByDiagonal(board, playerToken)) {
            return 10;
        }
        else if (endStates.checkWinByRow(board, opponentToken) || endStates.checkWinByColumn(board, opponentToken) || endStates.checkWinByDiagonal(board, opponentToken)) {
            return -10;
        }

        return 0;
    }

    public boolean hasCellsLeft(Cell[] board) {
        for (int i=0; i<board.length; i++) {
            if (board[i].getToken() == Token.EMPTY) {
                return true;
            }
        }
        return false;
    }

    int MAX = 1000;
    int MIN = -1000;

    public int alphaBeta(Cell[] board, int depth, boolean isMax, int alpha, int beta) {
        int score = evaluate(board);
        if (score == 10) {
            return score;
        }

        if (score == -10) {
            return score;
        }
        if (hasCellsLeft(board) == false) {
            return 0;
        }
        if (isMax) {
            int best = MIN;
            for (int i=0; i<board.length; i++) {
                if (board[i].getToken() == Token.EMPTY) {
                    board[i].setToken(playerToken);
                    int val = alphaBeta(board,depth+1, !isMax, alpha, beta);
                    best = Math.max(best, val);
                    alpha = Math.max(alpha, best);
                    board[i].resetMarker();
                }
                 if (best <= alpha) {
                    break;
                }
            }
            return best;
        }
        else {
            int best = MAX;
            for (int i=0; i<board.length; i++) {
                if (board[i].getToken() == Token.EMPTY) {
                    board[i].setToken(playerToken);
                    int val = alphaBeta(board, depth+1, isMax, alpha, beta);
                    best = Math.min(best, val);
                    beta = Math.min(beta, best);
                    board[i].resetMarker();
                }
                if (beta <= alpha) {
                    break;
                }
            }
            return best;
        }
    }

    public int minimax(Cell[] board,  int depth, boolean isMax) {
        int score = evaluate(board);
        int best;

        if (score == 10) {
            return score;
        }

        if (score == -10) {
            return score;
        }
        if (hasCellsLeft(board) == false) {
            return 0;
        }
        if (isMax) {
            best = -1000;
            for (int i=0; i<board.length; i++) {
                if (board[i].getToken() == Token.EMPTY) {
                    board[i].setToken(playerToken);
                    best = Math.max(best, minimax(board, depth+1, !isMax));
                    board[i].resetMarker();
                }
            }
            return best;
        }
        else {
            best = 1000;
            for (int i=0; i<board.length; i++) {
                if (board[i].getToken() == Token.EMPTY) {
                    board[i].setToken(opponentToken);
                    best = Math.min(best, minimax(board, depth+1, !isMax));
                    board[i].resetMarker();
                }
            }
            return best;
        }
    }

    public int findBestMove(Cell[] board) {
        int bestValue = -1000;
        int bestMove = -1;
        for (int i=0; i<board.length; i++) {
            if (board[i].getToken() == Token.EMPTY) {
                board[i].setToken(playerToken);
                //int moveValue = minimax(board, 0, false);
                int moveValue = alphaBeta(board, 0, true, -1000, 1000);
                board[i].resetMarker();
                if (moveValue > bestValue) {
                    bestMove = i;
                    bestValue = moveValue;
                }
            }
        }
        return bestMove;
    }
}

The board is an array of nine Cells, each initially holding the enum value Token.EMPTY, which can be replaced with Token.X or Token.O.
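
Token and Cell are not shown in the question; based on how they are used above, they presumably look something like this (a hypothetical reconstruction, not the original code):

public enum Token {
    X, O, EMPTY
}

public class Cell {

    private Token token = Token.EMPTY;

    public Token getToken() {
        return token;
    }

    public void setToken(Token token) {
        this.token = token;
    }

    // clears the cell again; the search uses this to undo a trial move
    public void resetMarker() {
        token = Token.EMPTY;
    }
}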

This is the class that uses the algorithm:

public class ComputerPlayer {
    private Token playerToken;
    private Algorithm minimax;

    public ComputerPlayer(Token playerToken, Algorithm minimax) {
        this.playerToken = playerToken;
        this.minimax = minimax;
    }

    public Token getPlayerToken() {
        return playerToken;
    }

    public void makeMove(Cell[] board) {
        int chosenCell;
        chosenCell = minimax.findBestMove(board);
        board[chosenCell].setToken(playerToken);
        System.out.println("Player " + playerToken + " has chosen cell " + (chosenCell+1));
    }
}
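
The Algorithm type is not shown either; presumably it is a small interface that Minimax implements, along the lines of this hypothetical sketch:

public interface Algorithm {
    int findBestMove(Cell[] board);
}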

Solution

  • Alpha-beta pruning requires a good evaluation function for unfinished game states to be effective. It should be able to judge accurately when one player is "more likely" to win, and it uses that evaluation to prune variations that do not look promising.

    Right now your evaluation function only distinguishes a finished game from a game in progress, so you will not be able to do better than plain minimax; one possible heuristic is sketched at the end of this answer.

    However, if you are doing worse than minimax, you must also have a bug somewhere else. I recommend stepping through the code to see where it goes wrong; a corrected sketch of the alphaBeta method is also included below.
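
    For illustration, here is one possible heuristic for unfinished positions. It is purely hypothetical (the line table and the weights are mine, not from the question), and its values would need to be scaled against the ±10 terminal scores before being mixed into the search:

private static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},    // rows
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},    // columns
        {0, 4, 8}, {2, 4, 6}                // diagonals
};

// Scores a position that is still in progress: a line that is open for one
// player (no enemy token in it yet) scores quadratically in that player's tokens.
public int heuristic(Cell[] board) {
    int score = 0;
    for (int[] line : LINES) {
        int mine = 0;
        int theirs = 0;
        for (int i : line) {
            Token t = board[i].getToken();
            if (t == playerToken) {
                mine++;
            } else if (t == opponentToken) {
                theirs++;
            }
        }
        if (theirs == 0) {
            score += mine * mine;       // 0, 1 or 4 points for us
        }
        if (mine == 0) {
            score -= theirs * theirs;   // symmetric penalty for the opponent
        }
    }
    return score;
}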
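
    As for the bug: comparing the two methods, a corrected alphaBeta might look like the sketch below (untested against your Cell and EndStates classes). The changes from the posted code are: the minimizer places opponentToken instead of playerToken, both branches flip isMax when recursing, and the maximizer's cutoff test is beta <= alpha rather than best <= alpha. The posted test best <= alpha is always true once alpha has been raised to best, so the maximizing loop breaks after at most one move, which is exactly the "next available space" behaviour you describe.

public int alphaBeta(Cell[] board, int depth, boolean isMax, int alpha, int beta) {
    int score = evaluate(board);
    if (score == 10 || score == -10) {
        return score;                   // somebody has already won
    }
    if (!hasCellsLeft(board)) {
        return 0;                       // draw
    }
    if (isMax) {
        int best = MIN;
        for (int i = 0; i < board.length; i++) {
            if (board[i].getToken() == Token.EMPTY) {
                board[i].setToken(playerToken);    // maximizer plays its own token
                best = Math.max(best, alphaBeta(board, depth + 1, false, alpha, beta));
                board[i].resetMarker();
                alpha = Math.max(alpha, best);
                if (beta <= alpha) {               // cutoff compares beta to alpha
                    break;
                }
            }
        }
        return best;
    }
    else {
        int best = MAX;
        for (int i = 0; i < board.length; i++) {
            if (board[i].getToken() == Token.EMPTY) {
                board[i].setToken(opponentToken);  // minimizer plays the opponent's token
                best = Math.min(best, alphaBeta(board, depth + 1, true, alpha, beta));
                board[i].resetMarker();
                beta = Math.min(beta, best);
                if (beta <= alpha) {
                    break;
                }
            }
        }
        return best;
    }
}

    Finally, findBestMove simulates the computer's own move before searching, so the search should start with the minimizer to move, mirroring the commented-out minimax(board, 0, false) call:

int moveValue = alphaBeta(board, 0, false, MIN, MAX);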