First off, I'm sorry for the slightly inaccurate title; I just didn't want it to be 30 words long. The alpha/beta pruning I implemented enormously reduced the number of evaluations when I applied it to my TicTacToe game, as you can see below.
Each pair of evaluation counts is measured with the same game state as input.
The problem arises when I apply the pruning to the Checkers-playing Neural Network I've been working on, which was the goal of this whole thing to begin with; I only whipped up the TicTacToe game to experiment with MiniMax + Alpha/Beta, as I've never dealt with these algorithms before.
Here is the same sort of experiment with the NN.
Now for the code (the Checkers one; let me know if you want to have a peek at the TicTacToe version, though the two are almost identical).
I'll paste the beginning of both methods only once, as it is absolutely identical in both; I will show both signatures, since they differ slightly.
A small note to make the code clearer:
Board is the object that keeps track of the pieces, the available moves, whose turn it is, whether the game has been won/drawn, etc.
Move is the object that contains all information pertinent to a move. When I create the clone on the first line of the method, I simply make a copy of the given board and the constructor applies the given move to it.
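To make that concrete, the cloning constructor boils down to something like this (a minimal sketch only; the Board copy constructor and the applyMove name are for illustration, not my exact code):

public class Testboard extends Board {
    // Copies the given board and immediately plays the given move on the copy,
    // so the original board passed into the search is never modified.
    public Testboard(Board original, Move move) {
        super(original);  // assumes Board offers a copy constructor
        applyMove(move);  // illustrative name for "play this move on this board"
    }
}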
private double miniMax(Board b, Move m, int depth) {
and
private double alphaBeta(Board b, Move m, int depth, double alpha, double beta) {
Beginning of both methods:
Testboard clone = new Testboard(b, m);
// Making a clone of the board in order to
// avoid making changes to the original one

if (clone.isGameOver()) {
    if (clone.getLoser() == null)
        // It's a draw, evaluation = 0
        return 0;
    if (clone.getLoser() == Color.BLACK)
        // White (Max) won, evaluation = 1
        return 1;
    // Black (Min) won, evaluation = -1
    return -1;
}

if (depth == 0)
    // Reached the end of the search, returning the current evaluation of the board
    return getEvaluation(clone);
Regular MiniMax continuation:
// If it's not game over
if (clone.getTurn() == Color.WHITE) {
    // It's White's turn (Max player)
    double max = -1;
    for (Move move : clone.getMoves()) {
        // For each child node (available move)
        // its minimax value is calculated
        double score = miniMax(clone, move, depth - 1);
        // Only the highest score is stored
        if (score > max)
            max = score;
    }
    // And is returned
    return max;
}
// It's Black's turn (Min player)
double min = 1;
for (Move move : clone.getMoves()) {
    // For each child node (available move)
    // its minimax value is calculated
    double score = miniMax(clone, move, depth - 1);
    // Only the lowest score is stored
    if (score < min)
        min = score;
}
// And is returned
return min;
}
MiniMax with Alpha/Beta pruning continuation:
// If it's not game over
if (clone.getTurn() == Color.WHITE) {
    // It's White's turn (Max player)
    for (Move move : clone.getMoves()) {
        // For each child node (available move)
        // its minimax value is calculated
        double score = alphaBeta(clone, move, depth - 1, alpha, beta);
        if (score > alpha)
            // If this score is greater than alpha,
            // it becomes the new alpha (best score so far for Max)
            alpha = score;
        if (alpha >= beta)
            // The loop is cut off early as soon as alpha is greater than or equal to beta
            break;
    }
    // The alpha value is returned
    return alpha;
}
// It's Black's turn (Min player)
for (Move move : clone.getMoves()) {
    // For each child node (available move)
    // its minimax value is calculated
    double score = alphaBeta(clone, move, depth - 1, alpha, beta);
    if (score < beta)
        // If this score is lower than beta,
        // it becomes the new beta (best score so far for Min)
        beta = score;
    if (alpha >= beta)
        // The loop is cut off early as soon as alpha is greater than or equal to beta
        break;
}
// The beta value is returned
return beta;
}
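For reference, the search is driven from the root roughly like this; a minimal sketch, not my actual code, just to show where the initial alpha/beta window comes from (pickMove is an illustrative name, and I'm assuming White/Max is to move and that Board exposes getMoves() like Testboard does):

// Minimal root-level sketch, for illustration only.
// Assumes every evaluation stays within [-1, 1], matching the bounds used above.
private Move pickMove(Board board, int depth) {
    Move bestMove = null;
    double alpha = -1; // worst possible outcome for White
    double beta = 1;   // best possible outcome for White
    for (Move move : board.getMoves()) {
        double score = alphaBeta(board, move, depth - 1, alpha, beta);
        // Keep the best move found so far; carrying its score forward as alpha
        // is what lets the deeper calls cut off the remaining siblings
        if (bestMove == null || score > alpha) {
            bestMove = move;
            alpha = score;
        }
    }
    return bestMove;
}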
I'm honestly stuck and not sure what I could do to figure out what's going on. I've tried MiniMax + A/B with several different, even randomly generated, neural networks, but I've never seen an improvement in the number of evaluations made. I hope someone here is able to shed some light on this situation, thanks!
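For reference, the evaluation counts above are gathered roughly like this (a simplified sketch, not my exact harness; evaluationCount, compareSearches and evaluateWithNetwork are illustrative names):

// Simplified instrumentation sketch: count how often the leaf evaluation runs,
// then run both searches on the same position and compare the counts.
private long evaluationCount = 0;

private double getEvaluation(Testboard board) {
    evaluationCount++;
    return evaluateWithNetwork(board); // the NN forward pass, returning a value in [-1, 1]
}

private void compareSearches(Board board, Move move, int depth) {
    evaluationCount = 0;
    miniMax(board, move, depth);
    System.out.println("MiniMax evaluations:    " + evaluationCount);

    evaluationCount = 0;
    alphaBeta(board, move, depth, -1, 1);
    System.out.println("Alpha/Beta evaluations: " + evaluationCount);
}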
Thanks to @maraca for helping me figure this out; I'm going to answer myself since he only replied with a comment.
There is nothing wrong with the code I posted; the problem lies with the evaluation function I was using once the search reached the depth limit.
I was using a still-untrained Neural Network that was essentially just spitting out random values. This forced MiniMax + A/B to go through all the nodes, because there was no consistency in its answers, and it turns out that consistency is exactly what pruning relies on.
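A quick way to confirm this is to temporarily swap the network out for a simple, deterministic evaluation; with something like the material count below (getPieceCount is an illustrative method, not something my Board actually has), the Alpha/Beta version should start needing far fewer evaluations than plain MiniMax again.

// Illustrative stand-in evaluation for sanity-checking the pruning:
// a plain material count mapped into [-1, 1], deterministic and consistent
// across sibling positions, unlike the output of an untrained network.
private double materialEvaluation(Testboard board) {
    double whitePieces = board.getPieceCount(Color.WHITE); // getPieceCount is assumed here
    double blackPieces = board.getPieceCount(Color.BLACK);
    // Each side starts with 12 pieces in checkers, so the result stays within [-1, 1]
    return (whitePieces - blackPieces) / 12.0;
}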