Search code examples
algorithmrecursionalphabetapruning

Understanding Alpha-Beta Pruning version for Minimax


I am currently working on my first project on C++ and chose to code a Connect Four (aka Score 4) with an AI based on Minimax, and more specifically on the Alpha Beta Pruning method.

So far, I understood that AB pruning consists in a recursive algorithm that takes into argument an alpha and a beta, which are "limits" beyond which you won't look for in your game tree. Also, we define a maximizing and a minimizing player, the former being the first player to have started playing the game. Finally, there is a "depth" which I understood as the "difficulty level" of the game, as the deeper an AI goes, the better it anticipates the moves; therefore the better chances there are for the computer to win the game.

However, my problem is the following. Suppose at some point that the computer notices its opponent (the human player) has a 3-streaks and is one move away from winning. Therefore my heuristic function should return -infinity to make the AI understand the "imminent danger" and to make it play to prevent the human player from winning: hence the recursion stops.

But here is the issue : when the recursion stops, the algorithm goes back to the previous layers of the game (the "shallower depths levels") where alternately, the 1st player will read max(alpha, alphabeta(depth - 1)) and the 2nd player min (beta, alphabeta(depth - 1)). This means that the -infinity score will necessarily be lost at some point, so that the AI may never understand that the game is lost.

Could someone please explain to me where I got wrong in the understanding of this algorithm ? A version of the pseudo-code can be found on Wikipedia.

Thanks very much in advance !


Solution

  • This means that the -infinity score will necessarily be lost at some point

    The -infinity will not be "lost" precisely because of the alternating min, max cutoff functions: if your algorithm detects a losing situation, it will deduce that the upper node is a losing move (by attributing it that -infinity value thanks to the min function), and so this branch will be pruned from the grand parent node (by selecting another branch with a higher score by virtue of the max function).

    Since another branch has been selected, the "bad" branch will not be taken by the ai player, and thus the losing move and the losing situation are avoided.