I am currently doing this problem in cs50 AI where we need to make a minimax algorithm for playing tictactoe. My algorithm doesn't work at all (it is really easy to beat the computer) and I was wondering what I was doing wrong. I am also pretty sure that all my other functions are correct and that only the minimax function is incorrect. Would really appreciate any help, thank you all!
import math, copy
X = "X"
O = "O"
EMPTY = None
def initial_state():
"""
Returns starting state of the board.
"""
return [[EMPTY, EMPTY, EMPTY],
[EMPTY, EMPTY, EMPTY],
[EMPTY, EMPTY, EMPTY]]
def player(board):
"""
Returns player who has the next turn on a board.
"""
xplayer = 0
yplayer = 0
for row in board:
for column in row:
if column == 'X':
xplayer += 1
elif column == 'O':
yplayer += 1
if xplayer == yplayer:
return X
else:
return O
def actions(board):
"""
Returns set of all possible actions (i, j) available on the board.
"""
ans = set()
rownum = 0
colnum = 0
for row in board:
colnum = 0
for column in row:
if not column:
ans.add((rownum, colnum))
colnum += 1
rownum += 1
return ans
def result(board, action):
"""
Returns the board that results from making move (i, j) on the board.
"""
if board[action[0]][action[1]] != None :
raise BoardError("Tried to place on full square")
move = player(board)
newboard = copy.deepcopy(board)
newboard[action[0]][action[1]] = move
return newboard
def winner(board):
"""
Returns the winner of the game, if there is one.
"""
for i in range(3):
sum = 0
for j in range(3):
if board[i][j] == 'X':
sum += 1
elif board[i][j] == 'O':
sum -= 1
if sum == 3:
return X
elif sum == -3:
return O
for j in range(3):
sum = 0
for i in range(3):
if board[i][j] == 'X':
sum += 1
elif board[i][j] == 'O':
sum -= 1
if sum == 3:
return X
elif sum == -3:
return O
if board[0][0] == board[1][1] == board[2][2]:
return board[0][0]
if board[2][0] == board[1][1] == board[0][2]:
return board[2][0]
return None
def terminal(board):
"""
Returns True if game is over, False otherwise.
"""
if winner(board):
return True
if not actions(board):
return True
return False
def utility(board):
"""
Returns 1 if X has won the game, -1 if O has won, 0 otherwise.
"""
if winner(board) == X:
return 1
elif winner(board) == O:
return -1
else:
return 0
def minimax(board):
"""
Returns the optimal action for the current player on the board.
"""
if player(board) == X:
aim = 1
elif player(board) == O:
aim = -1
if terminal(board):
return None
possiblemoves = actions(board)
for move in possiblemoves:
newboard = result(board,move)
#if move leads to the aimed score, return move
if utility(newboard) == aim:
return move
#if nodes down the chain return a value cos they have reached the aim, return this current move
if minimax(newboard):
return move
aim = 0
#change the aim to be a draw since winning is no longer possible
for move in possiblemoves:
newboard = result(board,move)
if utility(newboard) == aim:
return move
if minimax(newboard):
return move
#all the moves will result in a loss, so i just return the first move
return possiblemoves[0]
Basically X aims to maximise and O aims to minimise. What I have done for the algorithm is depending on the player, first look for moves that will result in either 1 or -1 depending on the player. Then if that doesn't happen, look for moves that result in 0(a tie). Then afterwards just simply return any move since that means the player will lose.
You seem to have lots of unnecessary functions and your minimax code looks way too complicated than it needs to be. Basically you need 4 main functions for your game:
Also, have you looked at the pseudo code for minimax from e.g. Wikipedia? :
function minimax(node, depth, maximizingPlayer) is
if depth = 0 or node is a terminal node then
return the heuristic value of node
if maximizingPlayer then
value := −∞
for each child of node do
value := max(value, minimax(child, depth − 1, FALSE))
return value
else (* minimizing player *)
value := +∞
for each child of node do
value := min(value, minimax(child, depth − 1, TRUE))
return value
Here is the general idea:
Depth is not necessary since tic-tac-toe is a simple game where we can always reach an end state. You can use depth to ensure that the computer will pick the shortest way possible to victory by adding depth to the heuristics return value. But I suggest you first get it to work without complicating things :)