I already have tried to modify my algorithm to work better, but I haven't achieved any result. My problem is that after the first moves, if I have, for example:
XX.
OO.
...
The Computer, instead of choosing 0 2, choses for example 1 2 and sometimes tries to go for position it can't.
My code:
#include "game.hpp"
pair<int,int> winner;
int m = INT_MAX;
pair<int,int> game::minimax(State ini) {
int v = maxValue(ini);
cout << v << endl;
return winner;
}
int game::maxValue(State u){
int check = u.getUtility();
if( check % 700 == 0 ) {
if( u.moves < m ) {
winner = u.move;
m = u.moves;
}
return check;
}
int v = INT_MIN;
u.makeDescendents();
while( !u.ls.empty() ) {
v = max(v,minValue(u.ls.front()));
u.ls.pop_front();
}
return v;
}
int game::minValue(State u) {
int check = u.getUtility();
if( check % 700 == 0 )
return check;
int v = INT_MAX;
u.makeDescendents();
while( !u.ls.empty() ) {
v = min(v,maxValue(u.ls.front()));
u.ls.pop_front();
}
return v;
}
For you can help me better I will make clear the meaning of some variables:
winner: is the position the computer will move
u.moves: is the depth on the search tree , for root is 0
m: supposed to save the less depth state solution , for that way filter solutions and computer must play the move more close of solution.
check: save utility value at this moment for known if is a terminal state
utility for win is 700 for tie is 0 and for defeat is -700
u.ls: list of children states
Something else , I think use m and winner global and return a global on minimax is a poor solution , do you can see some way to improve this?
Thanks very much.
First things first, what does u.getUtility()
return if the state is not terminal? If it returns 0, well then 0 % 700 == 0
is true, so it's just finding the first move it expanded and selecting it. Since I can't see the u.makeDescendents()
algorithm, I can't rule this out.
If that's not the case, then almost certainly your u.getUtility()
function is making the assumption that it is only ever being called for the same max player. i.e. It's returning 700 if X wins and -700 if X loses. If you're running both sides through the same minimax, then when you evaluate O as max, it's still trying to find a win for X because that's the only time it will see the evaluation as a win.
If this is the case, the fix is simple, determine which player's turn it is from the state and return the win/loss evaluation as if it were that player (which is typically always a loss in TicTacToe because you cannot make a move which loses you the game, you can only win by making a move and the previous player made the last move).
If neither of these suggestions solve the problem, the typical way to debug minimax problems is to step through the game tree one level deep at a time, exploring the path that is returning known invalid evaluations until you find the point at which is generates an incorrect value. Then you have to inspect it as to find out why. This is trivial for small games like tic tac toe, because it only goes 9 levels deep and you can get a perfect minimax value, but for any non trivial game, you generally have to look at your evaluation function to determine where the discrepancy is occurring.