In my minimax algorithm, when the computer is presented with a player having two ways to win the computer then will just pick the first open position of the board. Take for example below. X can win in position 0,2 and position 1,0.
X | |
__________
| x |
__________
x | o | o
Currently my algorithm will place the o in the position 0,1. I believe it does this because as minimax runs and places an o in position 0,1 and because this is not a win, it calls minimax again, this time for x. X then moves in position 0,2 for a win. This returns -10 for this position. If the computer moves in position 0,2, minimax is called and x is eventually placed in position 1,0 which also returns a -10 for this move. In fact no matter where the computer places the o, -10 is returned since no matter what the player will win. Because for every position o is placed it returns -10, the computer places the o in the first available slot, which is 0,1 as max is never updated from the first position. I would like it to place the o in position 1,0 or 0,2 just to show that it recognizes a block.
My algorithm is below. It is for a 3x3x3 but the concept is the same.
public int MiniMax(int pGameState[][][], int Depth, boolean IsMax){
FunctionCalls++;
if(CheckForWin(2, pGameState)){ //Max Player (since the computer is always 2)
return 10 - Depth;
}
if(CheckForWin(1, pGameState)){ //Player will win therefore we return -10. If this is the first level of the tree
//then the value return is -10. If the second ply then the value returned is -8.
//It is more important for the computer to win sooner than later.
return -10 - Depth;
}
if(Depth >= 2){
return 0;
}
if(IsMax){
int Value = Integer.MIN_VALUE;
for(int i=0; i<3; i++){
for(int j=0; j<3; j++){
for(int k=0; k<3; k++){
if(pGameState[i][j][k] == 0){
pGameState[i][j][k] = 2;
int best = MiniMax(CopyArray(pGameState), Depth+1, !IsMax);
if(best > Value)
Value = best;
pGameState[i][j][k] = 0;
}
}
}
}
return Value;
}
else{
int Value = Integer.MAX_VALUE;
for(int i=0; i<3; i++){
for(int j=0; j<3; j++){
for(int k=0; k<3; k++){
if(pGameState[i][j][k] == 0){
pGameState[i][j][k] = 1;
int best = MiniMax(CopyArray(pGameState), Depth+1, !IsMax);
if(best < Value)
Value = best;
pGameState[i][j][k] = 0;
}
}
}
}
return Value;
}
}
I initially call minimax this way
best = MiniMax(CopyArray(GameState), 0, false);
I then compare best with my previous Max. If best is larger I save this move as my computer's move.
One simple way to deal with the first-available-move-chosen problem is to order the valid moves before iterating over them. Consider the position you described in the question:
X . .
. X .
X O O
Here O
is to move. Before iterating over the board in the default way (left-to-right top-to-bottom), order the vector of four valid moves ((0, 1), (0, 2), (1, 0), (1, 2))
according to how good each move is. One way to do this is to use the evaluation function that will count how many threats each side has after the potential move is made. A threat for piece P
(which can be X
or O
) is a row, column or diagonal that has one empty square and two P
squares (so it's one piece P
short from becoming a winning line). Let's see what this eval function will tell us for each of the four valid moves for the given position. We count number of threats for both pieces and assign to position the value S
equal to the difference O_threats - X_threats
.
If O
makes a (0, 1)
move, then O_threats = 0
, X_threats = 2
, so the score S = 0 - 2 = -2
.
If O
makes a (0, 2)
move, then O_threats = 1
, X_threats = 1
, so the score S = 1 - 1 = 0
.
If O
makes a (1, 0)
move, then O_threats = 0
, X_threats = 1
, so the score S = 0 - 1 = -1
.
If O
makes a (1, 2)
move, then O_threats = 1
, X_threats = 2
, so the score S = 1 - 2 = -1
.
Based on the calculated scores, the order of visiting the valid moves should be as follows: (0, 2), (1, 0), (1, 2), (0, 1)
. We know that all four moves are losing moves given the perfect play. And since their scores are equal (to the loss value -10
), the first considered move (0, 2)
won't be overwritten by the next ones. This will make the moves of the program "more intelligent", because it now respects the threats created/blocked by the move made (and threat considerations are often used by humans when playing tic-tac-toe). You can enforce different order of visiting the valid moves by using different evaluation function to sort them.
Also note that move ordering can be very useful for increasing the search depth when combined with alpha-beta pruning, because it allows to consider good valid moves first and increases the chance of pruning more nodes. Although alpha-beta pruning might be an overkill for a such simple game, it can be really useful for more complex games.