machine-learning artificial-intelligence tic-tac-toe minimax alpha-beta-pruning

How to make a Self-Improving Tic Tac Toe Mega (10x10 board) game

I made a Tic Tac Toe (10x10 board size) bot. It plays better than an average human.

The board size is 10x10 instead of 3x3. 5 in a row O's or X's must be placed to win, instead of 3.

So, I made that bot using Minimax + Board Evaluation Function + Limited Available Moves to improve performance.

Let me explain my code.

First, I used Minimax algorithm alone but realized that. There are around 100 possible states after making first move, 100*99 after making second move, 100*99*98 after making 3rd move.

And it is probably impossible to count all possible board states.

So, What I did was created a board evaluation function.

I set some rules for board evaluation function, and it is same, no matter how many games the Bot play.

But I want to make a board evaluation function, that improves itself or give me some data, that I can use it to improve it. I cannot think of any way in Tic Tac Toe, can you guys?

Thanks

Solution

One method of doing this would be to generate statistics on board states. Create a board hash function that's 1:1 with effective board states, and populate a dictionary of moves taken. Record wins/losses for each move in each board state, and apply a weight to the move selection based on the win% of a given option.

This is memory intensive, but you can improve that by a factor of 8 by using a hash that's invariant on board rotation and mirroring (trivially, you could hash all 8 rotations and flips of the current state, and always return the minimum, for example; there might be a less brute force option.)

An additional improvement is to not record moves for any games which you are guaranteed to win/lose in your look ahead window, though that's a smaller percentage improvement.