Search code examples
pythonminimax

Minimax function for board game


def minimax(
        board: Board, 
        depth: int, 
        max_depth: int, 
        is_black: bool
    ) -> tuple[Score, Move]:
    """
    Finds the best move for the input board state.
    Note that you are black.

    Parameters
    ----------
    board: 2D list of lists. Contains characters "B", "W", and "_",
    representing black pawn, white pawn, and empty cell, respectively.

    depth: int, the depth to search for the best move. When this is equal
    to `max_depth`, you should get the evaluation of the position using
    the provided heuristic function.

    max_depth: int, the maximum depth for cutoff.

    is_black: bool. True when finding the best move for black, False
    otherwise.

    Returns
    -------
    A tuple (evalutation, ((src_row, src_col), (dst_row, dst_col))):
    evaluation: the best score that black can achieve after this move.
    src_row, src_col: position of the pawn to move.
    dst_row, dst_col: position to move the pawn to.
    """
    def max_value(board, depth):
        if utils.is_game_over(board) or depth == max_depth: #if game is over or depth is max_depth return current score
            return evaluate(board), None  # Return v along with action
        v = float('-inf')
        best_action = None  # Initialize best action
        for action in generate_valid_moves(board): #for each action in valid moves
            next_board = utils.state_change(board, action[0], action[1], False) #generate next state
            next_board = utils.invert_board(next_board, False) #invert the board for the white turns(because all the function is applied for black), so invert board will flip the board and turn white to black
            score, _ = max_value(next_board, depth + 1) #get the score for the next state
            if score > v:
                v = score
                best_action = action  # Update best action
        return v, best_action

    return max_value(board, depth)

I have try to implement a minmax algo to get the optimal movement for black or for white with the max depth. As i run it with my testcase, it just return the first action instead of the optimal action which result in the highest evaluation among all. I cant figure out the problem in my code..

def minimax(
        board: Board, 
        depth: int, 
        max_depth: int, 
        is_black: bool
    ) -> tuple[Score, Move]:
    """
    Finds the best move for the input board state.
    Note that you are black.

    Parameters
    ----------
    board: 2D list of lists. Contains characters "B", "W", and "_",
    representing black pawn, white pawn, and empty cell, respectively.

    depth: int, the depth to search for the best move. When this is equal
    to `max_depth`, you should get the evaluation of the position using
    the provided heuristic function.

    max_depth: int, the maximum depth for cutoff.

    is_black: bool. True when finding the best move for black, False
    otherwise.

    Returns
    -------
    A tuple (evalutation, ((src_row, src_col), (dst_row, dst_col))):
    evaluation: the best score that black can achieve after this move.
    src_row, src_col: position of the pawn to move.
    dst_row, dst_col: position to move the pawn to.
    """
    if depth == max_depth or utils.is_game_over(board):
        return evaluate(board), None

    # Determine the best move and its evaluation
    if is_black:
        best_evaluation = float('-inf')
        best_move = None
        for action in generate_valid_moves(board):
            new_board = utils.state_change(board, action[0], action[1], in_place=False)
            opponent_evaluation, _ = minimax(new_board, depth + 1, max_depth, False)
            if opponent_evaluation > best_evaluation:
                best_evaluation = opponent_evaluation
                best_move = (action[0], action[1])
        return best_evaluation, best_move
    else:
        best_evaluation = float('inf')
        best_move = None
        for action in generate_valid_moves(utils.invert_board(board, in_place=False)):
            new_board = utils.state_change(utils.invert_board(board, in_place=False), action[0], action[1], in_place=False)
            opponent_evaluation, _ = minimax(utils.invert_board(new_board, in_place=False), depth + 1, max_depth, True)
            if opponent_evaluation < best_evaluation:
                best_evaluation = opponent_evaluation
                best_move = (action[0], action[1])  # Convert from black's perspective to white's
        return best_evaluation, best_move

I just try a different approach where switch cases when each turn of black and white, by using the internal implementation invert table. So it basically pass the public test case for the solution, but it encounter issues that state that "Make sure that 'evaluate' is called with the correct input, especially for white" which i dont understand why? I thought my evaluate correctly evaluate for white and black as I have inverted the board after white turn.


Solution

  • Assuming the functions you call are all correct (like invert_board, ...), there is still this one problem:

    Although the board is inverted, you've not inverted the score. What is good for white is bad for black, and vice versa, so you should invert the score.

    I'd change this:

    score, _ = max_value(next_board, depth + 1)
    

    to this:

    score = -max_value(next_board, depth + 1)[0]
    

    Some unrelated comments:

    • I would have the depth count down instead of up. That way you can compare with 0 (for the base case), and don't need access to max_depth in this function. The initial caller of minimax should then only pass max_depth for the depth parameter, and it can do without the max_depth parameter.

    • invert_board could have quite a negative impact on running times. Once you get this working, you may get performance improvements by making all your functions (like evaluate) aware of who's turn it is, and have them return a value from the viewpoint of that player.

    • Related to the above point: The minimax function has a is_black parameter. I hope you have dealt with that information in the code you have not shared, i.e. make sure to invert the board if is_black is False before calling your max_value function.

    • state_change could also have a negative impact on running times. If so, you can improve on that by mutating board (so not making a copy), and then create a function that can perform an undo of a previous move. You'd call this after having made the recursive call.

    • The base case if statement could perform the two conditions in opposite order: that will save you a call of is_game_over when the maximum depth has been reached.

    • Related to this: is_game_over could also return the evaluation score in case the game is over. So it could either return a score (indicating the game is over) or None (indicating the game is not over). That can be used to then save a separate call to evaluate, which probably would have to do the same game-over checks again.

    • Consider implementing alpha-beta pruning: it may help to reduce the number of states to analyse, while it still ensures the same score is returned.