The question is as the title suggest.
I know that minimax algorithm does this for 2-people game (assume we want to maximize A's profit): when it is A’s turn, we take the max of the child values because we are maximizing A's profit, and when it is B’s turn, we take the min of the child values because we want to minimize B's profit.
However, I don't think the above logic proves that every sub-problem, the strategy is the most optimal in minimax algorithm. Any hints or solutions to the question I proposed? If the above logic does, then could you elaborate on it?
The claim minimax has is: Minimax is giving optimal strategy if the other player is also using the same strategy.
Base: For a leaf in the game tree - there is only one strategy, which minimax obviously chooses, and it is optimal - since it is the only one.
Hypothesis: minimax chooses optimal strategy for a game of depth d
.
Proof:
Let us look on a game of depth d+1
. There are obviously 2 possible scenarios:
max
phase. In this case, out of all possible moves we can do - minimax recursively evaluates the min-max value for each such strategy, and gives us the optimal result for a 'sub-game', where we have chosen this move. Each of these subgames is of depth d
, and from induction hypothesis, it is optimal.
min
phase. Similarly to the above logic, minimax splits the case to all possible moves the opponent can do, and evaluates their outcome. From Induction-Hypothesis, evaluations are optimal, and we choose the minimal from them - which will be what the opponent would have chosen, and is optimal strategy for him (and assume the opponent chooses optimally for every step of the way) - thus the opponent chooses the minimal evaluation, which minimize our profit, and thus maximize his profit.QED
Based on it, you can conclude another claim of minimax - the value returned by the strategy is not less than what you will actually get (regardless of the oppnent's strategy). This can be proven by induction very similarly to the above one.
Guidelines:
Base: One strategy only, and you are returning it, when the 'tree' is actually a leaf.
Claim: value returned by minimax for game of depth d
is a lower bound to all games played from this level, where you chose according to minimax, and no restriction on the opponent.
Proof:
Let's look at game of depth d+1
.
d
, where the induction hypothesis hold. The opponent can choose any one of these moves, so you can guarantee that after this move, you will have a value of at least min{game after opponent move}
, which is exactly what the min phase return.QED.