How the AI Works
The AI uses the minimax search algorithm to choose its next move. The AI considers every move available to player MAX (so called because this player is trying to maximize the utility of the resulting position) and selects the move with the highest predicted utility. How does player MAX calculate the predicted utility of a move? By making a recursive call to the minimax algorithm, except that this time the next player to move is player MIN, who is trying to minimize the utility of the positions. Player MIN, like player MAX, considers all the possible moves, but chooses the move with the lowest predicted utility. How does player MIN calculate the predicted utility of a move? You can already guess: by making a recursive call to the minimax algorithm. At some point, however, these recursive calls must stop. In tic-tac-toe, this occurs when there are no valid moves left for the next player to make, either because the board has been won or because it has been completely filled with pieces. When one of these terminal positions is encountered, the utility function is applied to the state:
The board is a victory for player MAX: return 1
The board is a tie: return 0
The board is a defeat for player MAX: return -1
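The scheme above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the board representation (a 9-tuple of 'X', 'O', or None, with 'X' as player MAX) and the function names are my own.

```python
# All eight winning lines on a 3x3 board, as index triples.
LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def utility(board):
    """Terminal utility from player MAX's ('X') point of view."""
    w = winner(board)
    if w == 'X':
        return 1
    if w == 'O':
        return -1
    return 0  # tie

def minimax(board, player):
    """Value of `board` with `player` to move, assuming optimal play."""
    if winner(board) is not None or None not in board:
        return utility(board)  # terminal position: won or completely filled
    nxt = 'O' if player == 'X' else 'X'
    values = [minimax(board[:i] + (player,) + board[i+1:], nxt)
              for i, cell in enumerate(board) if cell is None]
    # MAX picks the highest child value; MIN picks the lowest.
    return max(values) if player == 'X' else min(values)

def best_move(board, player='X'):
    """Index of the move with the best predicted utility for `player`."""
    choose = max if player == 'X' else min
    nxt = 'O' if player == 'X' else 'X'
    moves = [i for i, cell in enumerate(board) if cell is None]
    return choose(moves,
                  key=lambda i: minimax(board[:i] + (player,) + board[i+1:], nxt))
```

Running `minimax` on the empty board returns 0, reflecting the well-known fact that perfectly played tic-tac-toe is a draw.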
Speeding up Minimax Search with Alpha Beta Pruning and a Transposition Table
The efficiency of the minimax search algorithm can be dramatically improved with alpha-beta pruning and a transposition table. These optimizations provide an exponential decrease in running time while guaranteeing never to return a state with a utility value less than what the vanilla minimax search returns. The logic behind alpha-beta pruning is illustrated in Figure 1. In chess, a transposition is a sequence of moves that results in a position that can also be reached by one or more alternate sequences of moves. A transposition table is a hash table of all positions that have been seen in a given minimax search; it is essentially a form of memoization.
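Because a hashable board representation is all that is needed, a transposition table for plain minimax (without pruning) can be as simple as a dictionary keyed by position; in Python, `functools.lru_cache` provides exactly this memoization. Again a sketch with illustrative names, not the project's code:

```python
from functools import lru_cache

LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def minimax(board, player):
    """Memoized minimax: the cache acts as the transposition table, so a
    position reached by different move orders is evaluated only once."""
    w = winner(board)
    if w is not None:
        return 1 if w == 'X' else -1
    if None not in board:
        return 0  # tie
    nxt = 'O' if player == 'X' else 'X'
    children = (minimax(board[:i] + (player,) + board[i+1:], nxt)
                for i, cell in enumerate(board) if cell is None)
    return max(children) if player == 'X' else min(children)
```

Without the cache, searching the empty board visits well over half a million nodes; with it, each of the few thousand distinct positions is evaluated once.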
Challenges in Combining Alpha Beta Pruning and a Transposition Table
During development, one of the most challenging bugs I faced involved naively combining alpha-beta pruning and a transposition table: doing so causes the minimax search to return suboptimal states. The breakthrough came when I noticed that my implementation seemed to play perfectly with only alpha-beta pruning, and likewise seemed to play perfectly with only a transposition table. The suboptimal results appeared only when both optimizations were present, so I realized at that moment that the combination itself must be causing the bug. The problem is that a transposition table depends on the minimax search returning exact values, but with alpha-beta pruning, whenever a cutoff occurs, the value returned is only an upper bound or a lower bound on the true value. An internet search confirmed my suspicions and provided pseudocode for a fix to the problem. You can find my code on GitHub.
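The standard fix (the kind of pseudocode such a search turns up, e.g. for negamax with a transposition table) is to store alongside each cached value a flag recording whether it is exact, a lower bound (from a beta cutoff), or an upper bound (from an alpha cutoff), and to use bound entries only to narrow the search window. A hedged sketch of that fix, using the same illustrative board representation as above rather than the project's actual code:

```python
LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

EXACT, LOWER, UPPER = 0, 1, 2  # what a cached value means
table = {}  # transposition table: (board, player) -> (flag, value)

def search(board, player, alpha=-2, beta=2):
    """Alpha-beta minimax with a bound-aware transposition table."""
    key = (board, player)
    entry = table.get(key)
    if entry is not None:
        flag, value = entry
        if flag == EXACT:
            return value
        if flag == LOWER:
            alpha = max(alpha, value)  # bounds only narrow the window
        elif flag == UPPER:
            beta = min(beta, value)
        if alpha >= beta:
            return value
    w = winner(board)
    if w is not None:
        return 1 if w == 'X' else -1
    if None not in board:
        return 0  # tie
    a0, b0 = alpha, beta  # remember the original window for classification
    if player == 'X':
        value = -2
        for i, cell in enumerate(board):
            if cell is None:
                value = max(value, search(board[:i] + ('X',) + board[i+1:],
                                          'O', alpha, beta))
                alpha = max(alpha, value)
                if alpha >= beta:
                    break  # beta cutoff: value is only a lower bound
    else:
        value = 2
        for i, cell in enumerate(board):
            if cell is None:
                value = min(value, search(board[:i] + ('O',) + board[i+1:],
                                          'X', alpha, beta))
                beta = min(beta, value)
                if alpha >= beta:
                    break  # alpha cutoff: value is only an upper bound
    # The crucial step: classify the result before caching it, so that a
    # value produced by a cutoff is never reused as if it were exact.
    if value <= a0:
        flag = UPPER
    elif value >= b0:
        flag = LOWER
    else:
        flag = EXACT
    table[key] = (flag, value)
    return value
```

The naive version stores every returned value as if it were exact; the flag classification above is what makes the cache safe to reuse under pruning.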
Written by Chet Weger. Questions, comments, bugs? Contact me at chetweger [at] gmail.com.
See my similar projects including Self-Learning Meta-Tic-Tac-Toe and Interactive Meta-Tic-Tac-Toe. Code for this project is available at https://github.com/chetweger/min-max-games.