beta in alpha beta search

Hi! I'm trying to implement an alpha-beta search, but i first want to understand all the logic behind it, not just implementing it using some kind of pseudo code.

What i understand is this: A white player makes a move(let's call it move1). The first move is saved as alpha (the minimum value the player is assured of). Now, if we move to the next possible move by white(move2), and see that the black player's first response results a valuation that is worse than alpha, we can skip all possible black's counter moves as we already know that when white makes move2, the worst possible result is worse than move1's worst possible result.

But, what i don't understand is that beta variable. From chess programming wiki i read : ' the maximum score the minimizing player is assured of '. But i can't really get the idea behind it.

Can somebody please explain it in very simple terms? Thank you very much.

Solution

In chess, there is no easy way to tell if move1 is better than move2 (from your example). An approximation is achieved by looking at "hard" parameters: Number and value of pieces, presence of double or free pawns, ... Usually such an approximation is plugged into the minimax algorithm.

Minimax

Simply speaking, the idea is as follows: First, all possible moves are expanded (white-black-white-black-...) until a predefined depth or time limit has been reached. This creates a tree of board positions (with moves as edges), and the leaves are evaluated using an heuristics (as described above). Then, the tree is collapsed, leading in the end to an evaluation of move1 vs. move 2.

How does the collapsing work? It starts at the leaves of the tree and assigns a value to each node (aka board position). For each node for which the value of all children is known, the childs' values are aggregated: If it's white's turn, then the best value for white is taken (max); if it's black's turn, the worst (min). Hence the name minimax. This is repeated until the root has been reached.

Assume the following tree of board positions:

 A
|  \
B1  B2
|   |  \
A11 A21 A22

Now assume the following evaluation: A11 = 0, A21 = -1, A22 = +1 (positive value is good for white). We assume from our approximation that position A21 is better than A22 (for black), so we assign -1 to the node B2. For B1 this is clear, its value is 0. Now we assume that B1 is better than B2 for white, hence A's value is 0, and white should move to achive position B1.

Alpha-beta pruning

The idea here is not to build the whole tree, but to do a depth-first search for the more promising moves in order to achieve an early cutoff. In the example above, if we walk the tree depth-first from left to right (A-B1-A11-B2-A21-...), we know after evaluating A21 that position B2 is worse than position B1 for white. Hence, there is no need to evaluate A22 anymore. Alpha and beta simply store the evaluations of the currently known best possible move for white, and the currently known best possible reply for black. The order in which the nodes of the tree are walked (initial ordering) determines if and how many cutoffs are possible. From Wikipedia:

Normally during alpha-beta, the subtrees are temporarily dominated by either a first player advantage (when many first player moves are good, and at each search depth the first move checked by the first player is adequate, but all second player responses are required to try to find a refutation), or vice versa. ...

If ordering is suboptimal, more subtrees will have to be explored completely.

Optimization

Strictly speaking, the tree is a DAG, as identical board positions can be achieved through different combinations of moves (e.g., transpositons). Employ a hash table to detect identical positions, this is going to save a lot of computational effort.