I'm banging my head against the wall trying to implement negamax for tic-tac-toe
def negamax(board_obj, mark, depth)
if board_obj.game_over?
return value(board_obj)
else
max = -1.0/0 # negative infinity
if mark == @mark
next_mark = @opponent_mark
else
next_mark = @mark
end
board_obj.empty_squares.each do |square|
board_obj[square] = mark
x = -negamax(board_obj, next_mark, depth + 1)
board_obj[square] = ' '
if x > max
max = x
@scores << x
@best_move = square if depth == 1
end
end
return max
end
end
# determines value of final board state
def value(board_obj)
if board_obj.mark_win?(@mark)
return 1
elsif board_obj.mark_win?(@opponent_mark)
return -1
else
return 0
end
end
the rest of the code is here: https://github.com/dave-maldonado/tic-tac-doh/blob/AI/tic-tac-doh.rb
It does produce a result but the AI is easily beat so I know something's wrong, any help is appreciated!
The problem is that value
needs to be relative to the mark
in the current execution of negamax
rather than always relative to the computer. If you pass in the mark
argument to value
from negamax
with the following modified definition for value
, you'll get the right results:
def value(board_obj, mark)
if board_obj.mark_win?(mark)
return 1
elsif board_obj.mark_win?(mark == 'X' ? 'O' : 'X')
return -1
else
return 0
end
end
That is, the first two lines of the negamax
body need to be:
if board_obj.game_over?
return value(board_obj, mark)
That said, this overall program leaves an awful lot to be desired relative to Ruby, good design principles, etc (no offense intended). Now that you have it running, you might want to head over to the Code Review SE for some feedback. :-) And while it's too late to use TDD ;-), it would also be a good one to put "under test".
Also, please understand that per one of the other comments, this is not a kind of question that you'll typically get an answer to here at SO. I don't even know if this question will survive the review process without getting deleted. I worked on it for a variety of personal reasons.
Update: Looking at your reference implementation, you'll note that the negamax code includes the expression sign[color]*Analysis(b)
. It's that sign[color]
that you were missing, effectively.