I'm implementing a Tic-Tac-Toe game with an AI using the Minimax algorithm (with Alpha-Beta Pruning) to select optimal moves. However, I'm experiencing performance issues when the board size increases or the number of consecutive marks required to win (k) grows. The current implementation works fine for smaller board sizes (e.g., 3x3), but as I scale up the board size to larger grids (e.g., 8x12), the performance drops significantly, especially with the depth of the search.
The algorithm is running too slowly on larger boards (e.g., 6x6 or 8x12) with more complex win conditions (e.g., 4 in a row, 5 in a row).
import random
import time
# Check if a player has won
def check_win(board, player, k):
rows = len(board)
cols = len(board[0])
# Check horizontally, vertically, and diagonally
for r in range(rows):
for c in range(cols):
if c <= cols - k and all(board[r][c+i] == player for i in range(k)): # Horizontal
return True
if r <= rows - k and all(board[r+i][c] == player for i in range(k)): # Vertical
return True
if r <= rows - k and c <= cols - k and all(board[r+i][c+i] == player for i in range(k)): # Diagonal
return True
if r >= k - 1 and c <= cols - k and all(board[r-i][c+i] == player for i in range(k)): # Anti-diagonal
return True
return False
# Heuristic function to evaluate the board state
def evaluate_board(board, player, k):
# Check if the player has already won
if check_win(board, player, k):
return 1000 # Win
elif check_win(board, 3-player, k): # Opponent
return -1000 # Loss
return 0 # No winner yet
# Minimax algorithm with Alpha-Beta Pruning
def minimax(board, depth, alpha, beta, is_maximizing_player, player, k):
# Evaluate the current board state
score = evaluate_board(board, player, k)
if score == 1000 or score == -1000:
return score
if depth == 0:
return 0 # Depth limit reached
if is_maximizing_player:
best = -float('inf')
for r in range(len(board)):
for c in range(len(board[0])):
if board[r][c] == 0: # Empty cell
board[r][c] = player # Make move
best = max(best, minimax(board, depth-1, alpha, beta, False, player, k))
board[r][c] = 0 # Undo move
alpha = max(alpha, best)
if beta <= alpha:
break
return best
else:
best = float('inf')
for r in range(len(board)):
for c in range(len(board[0])):
if board[r][c] == 0: # Empty cell
board[r][c] = 3 - player # Opponent's move
best = min(best, minimax(board, depth-1, alpha, beta, True, player, k))
board[r][c] = 0 # Undo move
beta = min(beta, best)
if beta <= alpha:
break
return best
# Function to find the best move for the AI
def best_move(board, player, k):
best_val = -float('inf')
best_move = (-1, -1)
for r in range(len(board)):
for c in range(len(board[0])):
if board[r][c] == 0: # Empty cell
board[r][c] = player # Make move
move_val = minimax(board, 4, -float('inf'), float('inf'), False, player, k) # Depth 4 for searching
board[r][c] = 0 # Undo move
if move_val > best_val:
best_move = (r, c)
best_val = move_val
return best_move
# Function to play the game
def play_game(board_size=(3, 3), k=3):
board = [[0 for _ in range(board_size[1])] for _ in range(board_size[0])]
player = 1 # 1 - X, 2 - O
while True:
# Display the board
for row in board:
print(' '.join(str(x) if x != 0 else '.' for x in row))
print()
# Check if someone has won
if check_win(board, player, k):
print(f"Player {player} wins!")
break
# AI's move
if player == 1:
row, col = best_move(board, player, k)
print(f"AI (X) move: {row}, {col}")
else:
# Simple random move for player O
row, col = random.choice([(r, c) for r in range(board_size[0]) for c in range(board_size[1]) if board[r][c] == 0])
print(f"Player O move: {row}, {col}")
board[row][col] = player
player = 3 - player # Switch player
# Check for a draw
if all(board[r][c] != 0 for r in range(board_size[0]) for c in range(board_size[1])):
print("It's a draw!")
break
# Speed test function
def test_speed(board_size=(3, 3), k=3):
start_time = time.time()
play_game(board_size, k)
print(f"Game duration: {time.time() - start_time} seconds")
# Example: play on a 8x8 board, with 4 in a row to win
test_speed((8,8), 4)
I tried limiting the depth of the Minimax search to reduce the number of possibilities being evaluated. For example, I reduced the search depth to 2 for larger boards:
move_val = minimax(board, depth=2, alpha=-float('inf'), beta=float('inf'), is_maximizing_player=False, player=player, k=k)
However, this only partially mitigates the issue, and the AI's decision-making still seems too slow.
Also i explored parallelizing the move evaluation using the ThreadPoolExecutor from Python’s concurrent.futures module:
from concurrent.futures import ThreadPoolExecutor
def parallel_minimax(board, depth, alpha, beta, is_maximizing_player, player, k):
with ThreadPoolExecutor() as executor:
futures = []
for r in range(len(board)):
for c in range(len(board[0])):
if board[r][c] == 0:
board[r][c] = player
futures.append(executor.submit(minimax, board, depth-1, alpha, beta, False, player, k))
board[r][c] = 0 # Undo move
for future in futures:
result = future.result()
# [process the result]
This approach improved parallelism, but it still does not solve the underlying issue of time complexity for large boards.
Please help to make the algorithm as fast as possible
Minimax, even with alpha-beta pruning, will have to look at an exponentially growing number of states. For larger board sizes this will mean you can only perform shallow searches.
I would suggest to switch to the Monte Carlo Search algorithm. It uses random sampling, making decisions whether to explore new branches in the search tree or to deepen existing ones. You can check out the Wikipedia page on Monte Carlo tree search for more information.
You mention 4-in-a-row, but realise that on large boards (like 8x12), that is an easy win for the first player.
Below is an implementation I made for answering your question. It defines a TicTacToe
class with methods like best_move
, play_row_col
, ... But it inherits from a more generic class which provides the core Monte Carlo search functionality with its mc_search
method. That superclass has no knowledge of the game logic; it only assumes two players and that it is turn-based with at each turn a finite number of moves. It will only refer to these moves with a sequential number, and will depend on the subclass methods to perform the corresponding moves. It is responsible for driving the search directions.
The TicTacToe
class does not have search logic. It depends on the superclass for that. Although it is not necessary, I decided to "help" find winning lines more effectively, and still add some logic in TicTacToe
that narrows the list of moves that the algorithm should consider, in two ways:
Only moves that are neighboring (possibly diagonally) occupied cells are considered. For the very first move only the center move is considered. This is a limitation that you might consider too strong, as in the beginning of the game on a large board, strong players may prefer to place their pieces further away from the other pieces. Still, I found that this restriction for the search algorithm still allows for reasonable good play (it's all relative to what strength you expect).
If there is a winning move for the player who's turn it is, this is detected, and only that move will be in the move list. If the last played piece created a "threat" to win on their next move, this also leads to a move list with just one move for the opponent, since they will want to defend.
These restrictions are only applicable for the search algorithm, not for the human player (of course).
As on larger boards the game may need a lot of moves to come to an end, I also added a parameter to limit the search depth in the Monte Carlo rollout phase, and consider the outcome a draw when there is no win within that number of moves counting from the start of the rollout. It could be set to 20 or 30 for example.
Here is the code:
from __future__ import annotations
from math import log
from time import perf_counter_ns
from random import shuffle, randrange
from abc import abstractmethod
from typing import Callable
from enum import StrEnum
from copy import deepcopy
# Helper function to get best list entry based on callback function
def best_index(lst: [any], evaluate: Callable[[any], int]) -> int:
results = list(map(evaluate, lst))
return results.index(max(results))
# Rather generic, abstract class for 2-player game
class MonteCarloGame:
class States(StrEnum):
FIRST_PLAYERS_TURN = "X"
SECOND_PLAYERS_TURN = "O"
FIRST_PLAYER_WON = "Player X won"
SECOND_PLAYER_WON = "Player O won"
DRAW = "It's a draw"
TerminalStates = (States.FIRST_PLAYER_WON, States.DRAW, States.SECOND_PLAYER_WON)
TurnStates = (States.FIRST_PLAYERS_TURN, States.SECOND_PLAYERS_TURN)
# Node in the Monte Carlo search tree has the sum of scores over a number of games:
class Node:
def __init__(self, num_children: int, last_player: MonteCarloGame.States):
self.last_player = last_player
self.score = 0
self.max_score = 0
self.children: [MonteCarloGame.Node] = [None] * num_children
self.unvisited = list(range(num_children))
shuffle(self.unvisited)
# To get a random child (i.e. move to next state)
def pick_child(self) -> int:
if self.unvisited: # If there are unvisited child nodes, choose one of those first
return self.unvisited.pop() # = a randomly chosen Node to expand
if not self.children:
raise ValueError("Cannot call pick_child on a terminal node")
# Use UCB1 formula to choose which node to explore further
log_n = log(self.max_score)
return best_index(self.children, lambda child: child.score / child.max_score + (log_n / child.max_score)**0.5)
def update_score(self, score: int):
self.max_score += 2 # Every simulated game counts for 2 points (i.e. the maximum score)
# Make the score relative to the player who made the last move
self.score += 2 - score if self.last_player == MonteCarloGame.States.FIRST_PLAYERS_TURN else score # A score can be 0, 1 or 2 (loss, draw, win).
def select_or_expand(self, index: int, num_children: int, last_player: MonteCarloGame.States) -> MonteCarloGame.Node:
child = self.children[index] or MonteCarloGame.Node(num_children, last_player)
self.children[index] = child
return child
def __init__(self):
self.mc_root = None
@abstractmethod
def copy(self) -> MonteCarloGame:
return self # must implement
@abstractmethod
def size_of_move_list(self) -> int:
return 0 # must implement
@abstractmethod
def state(self) -> States:
return self.States.FIRST_PLAYERS_TURN # must implement
@abstractmethod
def play_from_move_list(self, move_index: int) -> States:
return self.States.FIRST_PLAYERS_TURN # must implement
# To be called when a move is played from the subclass, so
# that the root can "follow" to the corresponding child node
def mc_update(self, move_index: int):
if not self.mc_root or move_index == -1:
self.mc_root = MonteCarloGame.Node(self.size_of_move_list(), self.state())
else:
self.mc_root = self.mc_root.select_or_expand(move_index, self.size_of_move_list(), self.state())
# Main Monte Carlo Search algorithm:
def mc_search(self, timeout_ms: int, rollout_depth: int) -> int:
if self.state() in self.TerminalStates:
return -1 # Nothing to do
if not self.mc_root:
self.mc_root = MonteCarloGame.Node(self.size_of_move_list(), self.state())
expiry = perf_counter_ns() + timeout_ms * 1_000_000
while perf_counter_ns() < expiry:
# Start at root (current game state)
game = self.copy()
node = self.mc_root
state = game.state()
path = [node]
# Traverse down until we have expanded a node, or reached a terminal state
while state in self.TurnStates and node.max_score:
move_index = node.pick_child()
next_state = game.play_from_move_list(move_index)
node = node.select_or_expand(move_index, game.size_of_move_list(), state)
path.append(node)
state = next_state
# Rollout
score = 1 # Default outcome is a draw (when depth limit is reached)
for depth in range(rollout_depth): # Limit the depth
if state not in self.TurnStates:
score = self.TerminalStates.index(state) # Absolute score (good for player #2)
break
state = game.play_from_move_list(randrange(game.size_of_move_list()))
# Back propagate the score up the search tree
for node in path:
node.update_score(score)
# Choose the most visited move
return best_index(self.mc_root.children, lambda n: n.max_score if n else 0)
class TicTacToe(MonteCarloGame):
class Contents(StrEnum):
INVALID = "#"
EMPTY = "."
FIRST_PLAYER = MonteCarloGame.States.FIRST_PLAYERS_TURN
SECOND_PLAYER = MonteCarloGame.States.SECOND_PLAYERS_TURN
def __init__(self, num_rows: int=3, num_cols: int=3, win_length: int=3, *, source: TicTacToe=None):
super().__init__()
if source: # copy
self.__dict__ = deepcopy(source.__dict__)
return
self.num_rows = num_rows
self.num_cols = num_cols
self.win_length = win_length
width = num_cols + 1 # include dummy column
bottom = width * (num_rows + 1)
# Surround board with dummy row and column, and create a flat representation:
self.board = [
(self.Contents.INVALID if i < width or i >= bottom or i % width == 0 else self.Contents.EMPTY)
for i in range(width * (num_rows + 2) + 1) # flat structure
]
self.free_cells_count = num_cols * num_rows
# Maintain a reasonable move list (not ALL free cells) to keep search efficient
# At the start we only consider one move (can adapt to allow some more...)
# This list is not relevant when a "human" player plays a move (any free cell is OK)
self.move_list = [self.row_col_to_cell(num_rows//2, num_cols//2)]
self._state = self.States.FIRST_PLAYERS_TURN
# Add some information about current threats on the board (to narrow rollout paths)
self.open_threats: [int] = [] # board cell indices where the last player has opportunity to win
self.immediate_win = 0 # board cell index where the next move wins the game
self.forced_cell = 0 # derived from previous two attributes
def copy(self) -> TicTacToe:
return TicTacToe(source=self)
def size_of_move_list(self) -> int:
if self._state in self.TerminalStates:
return 0
# If we are on a forced line to a win or to avoid a loss, consider one move only
if self.forced_cell:
return 1
return len(self.move_list)
def state(self) -> MonteCarloGame.States:
return self._state
def play_from_move_list(self, move_index: int) -> MonteCarloGame.States:
return self.play_at_cell(self.move_index_to_cell(move_index))
def play_at_cell(self, cell: int) -> MonteCarloGame.States:
n = len(self.board)
turn = self._state
self._state = self.TurnStates[1-self.TurnStates.index(turn)] # toggle
self.board[cell] = self.Contents(turn) # Type cast to silence warning
self.free_cells_count -= 1
if cell in self.move_list:
self.move_list.remove(cell)
# Add surrounding cells to move_list for next turn
w = self.num_cols
for neighbor in (cell-1,cell+1,cell-w-2,cell-w-1,cell-w,cell+w,cell+w+1,cell+w+2):
if self.board[neighbor] == self.Contents.EMPTY and neighbor not in self.move_list:
self.move_list.append(neighbor)
threats = set() # Collect cells where the move creates a threat (where opponent must play at)
# check if this move makes a direct win or a threat
for step in (1, self.num_cols, self.num_cols + 1, self.num_cols + 2):
span = self.win_length * step
i = next(k for k in range(cell - step, -1, -step) if self.board[k] != turn)
j = next(k for k in range(cell + step, n, step) if self.board[k] != turn)
if j - i > span:
# it's an immediately winning move
self._state = (self.States.FIRST_PLAYER_WON if turn == self.States.FIRST_PLAYERS_TURN
else self.States.SECOND_PLAYER_WON)
return self._state
# look for a single gap in a line, such that this gap is a threat
if self.board[i] == self.Contents.EMPTY:
i2 = next(k for k in range(i - step, -1, -step) if self.board[k] != turn)
if j - i2 > span:
threats.add(i)
if self.board[j] == self.Contents.EMPTY:
j2 = next(k for k in range(j + step, n, step) if self.board[k] != turn)
if j2 - i > span:
threats.add(j)
if not self.free_cells_count:
self._state = self.States.DRAW
self.immediate_win = next((c for c in self.open_threats if c != cell), 0)
self.open_threats = list(threats)
self.forced_cell = self.immediate_win or next(iter(self.open_threats), 0)
return self._state
def play_row_col(self, row: int, col: int) -> MonteCarloGame.States:
if not 0 <= row < self.num_rows or not 0 <= col < self.num_cols:
raise ValueError("row/col out of range")
cell = self.row_col_to_cell(row, col)
if self.board[cell] != TicTacToe.Contents.EMPTY:
raise ValueError("cell is already occupied")
move_index = self.cell_to_move_index(cell)
state = self.play_at_cell(cell)
self.mc_update(move_index) # must be called after the move is played
return state
# Some methods that convert between different ways to identify a move
def move_index_to_cell(self, move_index: int) -> int:
# Convert index in the move list to cell in the flattened board
# If we are on a forced line to a win or to avoid a loss, consider that move only
return self.forced_cell or self.move_list[move_index]
def cell_to_move_index(self, cell: int) -> int:
# Convert cell in the flattened board to index in the move list
cells = [self.forced_cell] if self.forced_cell else self.move_list
try:
return cells.index(cell)
except ValueError:
return -1
def row_col_to_cell(self, row: int, col: int) -> int:
# Convert row / col to the cell in the flattened board
return (row + 1) * (self.num_cols + 1) + col + 1
def cell_to_row_col(self, cell: int) -> tuple[int, int]:
# Convert cell in flat board structure back to row/col:
return cell // (self.num_cols + 1) - 1, cell % (self.num_cols + 1) - 1
# The method that starts the search for the best move, and returns it
def best_move(self, timeout_ms: int, rollout_depth: int=15) -> tuple[int, int]:
return self.cell_to_row_col(self.move_index_to_cell(
self.mc_search(timeout_ms, rollout_depth)
))
def __repr__(self) -> str:
return "#" * (self.num_cols * 2 + 2) + " ".join(
(content, "#\n#")[content == self.Contents.INVALID]
for content in self.board[self.num_cols+1:-self.num_cols-2]
) + " #\n" + "#" * (self.num_cols * 2 + 3)
def main():
# Create the game instance and set parameters
game = TicTacToe(num_rows=8, num_cols=12, win_length=5)
timeout_ms = 2000 # milliseconds that tree search may spend for determining a good move
rollout_depth = 20 # max number of moves in a rollout game: if reached it counts as a draw
human = (MonteCarloGame.States.SECOND_PLAYERS_TURN, )
# Game loop
state = game.state()
while state in MonteCarloGame.TurnStates:
print(game)
print(f"Player {state} to move...")
if state in human:
try:
state = game.play_row_col(int(input(f"Enter row (0-{game.num_rows-1}): ")),
int(input(f"Enter column (0-{game.num_cols-1}): ")))
except ValueError:
print("Invalid move. Try again:")
else:
state = game.play_row_col(*game.best_move(timeout_ms, rollout_depth))
print("Computer has played:")
print(game)
print(f"Game over: {state}")
main()
The above can be run as-is and will play a computer-vs-human game on a 8x12 board, with a 5-in-a-row target, and the computer getting 2 seconds time per move, and a rollout depth of 20. Have a go at it, and try to beat it. It is possible.
This is not an end-product, but I think this program plays reasonable games.
For smaller boards, like 3x3, you don't need 2 seconds, and could set it to just 100 milliseconds.
Several things could be improved, like:
The search could keep going during the time that the human player is to enter their move.
The search time could be more dynamic so that it would return a best move faster when it becomes clear that one move stands out among the rest.
More logic could be added to the TicTacToe
class so that it would detect forced lines at an earlier stage and limit the move list (for the Monte Carlo search) to the relevant moves to follow those lines. Think of the creation of double threats, ...etc.
The depth limit for the rollout phase could be made more dynamic, depending on the size of the board, the time still available,...
I hope this meets some of your requirements and this gives some leads on how to further improve it. At least I was happy with the result that less than 300 lines of code could give.