Search code examples
cartificial-intelligencetic-tac-toeminimax

Minimax algorithm with TicTacToe not working properly


I already posted a similar question here in this forum but since the old post got a little long and I rewrote my algorithm I'm starting this new post. The old post can be found here.


So I'm simply trying to implement a minimax algorithm for my TicTacToe game, except it turned out to be quite hard and even after days of trying to find the mistake, I cannot find it. you can find my code below. First, I have a few definitions, typedefs and declarations:

typedef signed char s8;
typedef unsigned char u8;
typedef s8 score;

#define STATE_00    getBoardState(0, 0)
#define STATE_01    getBoardState(0, 1)
#define STATE_02    getBoardState(0, 2)
#define STATE_10    getBoardState(1, 0)
#define STATE_11    getBoardState(1, 1)
#define STATE_12    getBoardState(1, 2)
#define STATE_20    getBoardState(2, 0)
#define STATE_21    getBoardState(2, 1)
#define STATE_22    getBoardState(2, 2)

typedef enum {
    EPlayerX = 1,
    EPlayerO
} EPlayer;

typedef enum {
    EMinimizing = 0,
    EMaximizing
} EMinMax;

static u8 g_boardState[3][3] = {
    {0, 0, 0,},
    {0, 0, 0,},
    {0, 0, 0,},
};

These are followed by some functions:

u8 getBoardState(u8 row, u8 column);

EPlayer isWon(void)
{
    EPlayer winningBoards[8][3] = {
        {STATE_00, STATE_01, STATE_02},
        {STATE_10, STATE_11, STATE_12},
        {STATE_20, STATE_21, STATE_22},
        {STATE_00, STATE_10, STATE_20},
        {STATE_01, STATE_11, STATE_21},
        {STATE_02, STATE_12, STATE_22},
        {STATE_00, STATE_11, STATE_22},
        {STATE_20, STATE_11, STATE_02},
    };
    u8 i;
    for(i=0; i<8; i++){
        if( (winningBoards[i][0] != 0) &&
            (winningBoards[i][0] == winningBoards[i][1]) &&
            (winningBoards[i][0] == winningBoards[i][2])){
                return winningBoards[i][0];
        }
    }
    return 0;
}

u8 getBoardState(u8 row, u8 column)
{
    return g_boardState[row][column];
}

void setBoardState(u8 row, u8 column, u8 state)
{
    g_boardState[row][column] = state;
}

u8 isDraw(void)
{
    u8 i, j;
    for(i=0; i<3; i++){
        for(j=0; j<3; j++){
            if(getBoardState(i, j) == 0){
                return 0;
            }
        }
    }
    return 1;
}

void dumpTable(score table[3][3])
{
    int i, j;
    for(i=0; i<3; i++) {
        printf("\n");
        for(j=0; j<3; j++){
            printf("%6i ", table[i][j]);
        }
    }
    printf("\n");
}

EPlayer playerSwitch(EPlayer player)
{
    if(player == EPlayerO) return EPlayerX;
    if(player == EPlayerX) return EPlayerO;
    return 0;
}

EMinMax modeSwitch(EMinMax mode)
{
    if(mode == EMaximizing) return EMinimizing;
    if(mode == EMinimizing) return EMaximizing;
    return 0;
}

Then there is the actual minimax algorithm here called scoring:

score scoring(EMinMax mode, EPlayer player, u8 depth)
{
    score thisScore, tempScore;
    if(mode == EMaximizing){
        thisScore = -20;
        if(isWon()) return 15-depth;
    }
    if(mode == EMinimizing){
        thisScore = 20;
        if(isWon()) return depth-15;
    }
    if(isDraw()){
        return 0;
    }

    u8 i, j;
    for(i=0; i<3; i++){
        for(j=0; j<3; j++){
            if(getBoardState(i, j) == 0){
                setBoardState(i, j, player);
                tempScore = scoring(modeSwitch(mode), playerSwitch(player), depth+1);
                if((mode == EMaximizing) && (tempScore > thisScore)){
                    thisScore = tempScore;
                }
                if((mode == EMinimizing) && (tempScore < thisScore)){
                    thisScore = tempScore;
                }
                setBoardState(i, j, 0);
            }
        }
    }

    return thisScore;
}

And finally a function printing the scores in a table as well as the main:

void printSocredBoards(EPlayer player)
{   
    score thisScore[3][3] = {
        {STATE_00, STATE_01, STATE_02},
        {STATE_10, STATE_11, STATE_12},
        {STATE_20, STATE_21, STATE_22},
    };
    int i, j;
    if((isWon() == 0) && (isDraw() == 0)){
        for(i=0; i<3; i++){
            for(j=0; j<3; j++){
                if(getBoardState(i, j) == 0){
                    setBoardState(i, j, player);
                    thisScore[i][j] = scoring(EMaximizing, playerSwitch(player), 0);
                    setBoardState(i, j, 0);
                }
            }
        }
    }
    dumpTable(thisScore);
}

int main(int argc, char **argv)
{

    printSocredBoards(EPlayerO);

    return 0;
}

As far as I know, this algorithm should work fine, however it gives me a nonsensical output:

 7      7      7 
 7      0      7 
 7      7      7 

What am I missing? Thanks in advance for any help.


Solution

  • I believe that the problem lies with this chunk of code from scoring where your cases are flipped from the correct return values:

    if(mode == EMaximizing){
        thisScore = -20;
        if(isWon()) return 15-depth;
    }
    if(mode == EMinimizing){
        thisScore = 20;
        if(isWon()) return depth-15;
    }
    

    Intuitively, the problem is that when scoring reaches this point in the code, the call to isWon is evaluating the result of the previous piece placement, which was made with the other choice for mode.

    For example, when scoring is called with EMaximizing and the board state is already won, then that implies that the player who is EMinimizing wins in this state and the returned score should reflect this (i.e. it should be negative). Since depth reaches a maximum of 8 your return value when mode == EMaximizing is always positive, which is not what you want.

    When the cases are reversed, your program outputs all zeroes, which seems more sensible as perfect players should always draw. I also tested the code with the following line added to the top of printScoredBoards to hard-code the first play to the top-left corner:

    setBoardState(0, 0, playerSwitch(player));
    

    This yields the following:

     0     10     10 
    10      0     10 
    10     10     10 
    

    Correctly identifying both that the second player cannot choose the top-left and will lose if they play anything other than the center as their opening move.