Search code examples
treemachine-learningartificial-intelligencemultiplayer

MCTS Handling N player games opponent actions


I would like to know how N player games are handled in MCTS. Are the actions of the opponent embedded in the search tree? Is their value generated the same way as the other Actions? If so wouldn't their value change the total value of the parent state in a wrong way? mcts.ai is a nice helpful site but concerning n-player games. The example code just states "would need extra logic for n-player games"

Thank you in advance.


Solution

  • In fact it is not as easy, as just modeling few additional players that maximize their own profits.

    There have been at least few different approaches to the problem of multi-player games, including:

    • max^n (the simplest)
    • Paranoid
    • Best Reply Search (BRS)
    • Coalition-Mixer

    The main issue in the MCTS based approaches is to find an equlibrium between lighweight simulations/evaluations and knowledge embeeded in them. Multi-player games introduce their own parameters to this complex equation, and as a result - there are some interesting modifications that can find better solution (in terms of limited resources) then naive approaches. One such method is "Playout search", described in detail in Playout Search for Monte-Carlo Tree Search in Multi-Player Games.

    The most important difference between 2-player and multi-player games is the fact, that in most of the 2-player games the point system is somehow "symmetric" - if I win, you lose, and vice versa. So, assuming that I want to win, I can think about it as a battle between me trying to win, and my opponent trying to win. Once we introduce the third player it is not as simple anymore. Now, if I win - everything is ok. But two other players does not have to play to win, it is enough for them to make me lose (and either of them wins), which makes a foundation of the paranoid strategy - we assume that all players play against us, without caring about who actually wins. This alternates the required model (as they no longer maximize any profit), and is just one of possible scenarios. With N players on the board, amount of possible coallitions (and their mixes) is enormous.