Python: Coding unbeatable Tic Tac Toe AI

Noughts & Crosses (or “Tic Tac Toe” for our American friends) offers up a nice coding challenge, being not too graphically complex but still requiring the thought and logic needed behind a nice programming project. It IS of course a game where if both parties play optimally the outcome is always a draw, but how do we code “optimally”? And how do we take advantage of players who don’t?

Whilst learning Python I came across the following resource, which offers up a very nice Noughts & Crosses coding tutorial. I highly recommend the tutorials for learning Python:

http://inventwithpython.com/chapter10.html

One thing bugged me though. It is very easy to beat the AI suggested. I’ll be honest and say that I was only “99.9% sure” that regardless of going first or second, it is possible to draw or win every time. A quick google confirmed this, so let’s have a look at a way of making an unbeatable AI.

Beatable AI

The algorithm in the tutorial above may have been chosen for simplicity. It offers an AI that would certainly outplay a “random” player.

  1. Check for a square that CPU can win on
  2. Check for square that player can win on
  3. Move on a free corner
  4. Move on free center
  5. Move on a free side

Coding it up

Let’s code both the general game code, and the computer decision code.

Game code
Playing = True
while Playing:
    InGame = True
    board = [' '] * 9
    print('Would you like to go first or second? (1/2)')
    if input() == '1':
        playerMarker = '0'
    else:
        playerMarker = 'X'
    displayBoard(board)

    while InGame:
        if playerMarker == '0':
            print('Player go: (0-8)')
            move = int(input())
            if board[move] != ' ':
                print('Invalid move!')
                continue
        else:
            move = getComputerMove(board)
        board[move] = playerMarker
        if checkWin(board, playerMarker):
            InGame = False
            displayBoard(board)
            if playerMarker == '0':
                print('Noughts won!')
            else:
                print('Crosses won!')
            continue
        if checkDraw(board):
            InGame = False
            displayBoard(board)
            print('It was a draw!')
            continue
        displayBoard(board)
        if playerMarker == '0':
            playerMarker = 'X'
        else:
            playerMarker = '0'

    print('Type y to keep playing')
    inp = input()
    if inp != 'y' and inp != 'Y':
        Playing = False

I leave some of the unnecessary extra stuff as an exercise: checking for valid input, counter for wins, losses, draws etc. Crosses (“X”) are the computer, noughts (“O”) are the player.

The general gist is a big outer loop, which manages each game. Within that loop is another loop that manages each turn and checks if anyone has won or if it’s a draw.

You’ll notice there are four user defined functions in the above.  Note: All functions (code starting def) need to be put ABOVE the looping code.

displayBoard(board) – prints out a representation of the board to the console

def displayBoard(b):
    print('     |     |')
    print('  ' + b[0] + '  |  ' + b[1] + '  |  ' + b[2])
    print('     |     |')
    print('-----------------')
    print('     |     |')
    print('  ' + b[3] + '  |  ' + b[4] + '  |  ' + b[5])
    print('     |     |')
    print('-----------------')
    print('     |     |')
    print('  ' + b[6] + '  |  ' + b[7] + '  |  ' + b[8])
    print('     |     |')

Take note that the board is therefore numbered like below:

numberedBoard

checkWin(board, currentPlayer) – checks for a three in a row

def checkWin(b, m):
    return ((b[0] == m and b[1] == m and b[2] == m) or  # H top
            (b[3] == m and b[4] == m and b[5] == m) or  # H mid
            (b[6] == m and b[7] == m and b[8] == m) or  # H bot
            (b[0] == m and b[3] == m and b[6] == m) or  # V left
            (b[1] == m and b[4] == m and b[7] == m) or  # V centre
            (b[2] == m and b[5] == m and b[8] == m) or  # V right
            (b[0] == m and b[4] == m and b[8] == m) or  # LR diag
            (b[2] == m and b[4] == m and b[6] == m))  # RL diag

checkDraw(board) – checks if board is full

def checkDraw(b):
    return ' ' not in b

getComputerMove(board) – returns computer move

Now, we’ve set the above game up in such a way that we can handle all the computer decisions in the function above, just by inputting the current board.

AI Code

Our game calls the “getComputerMove” function, passing only the board state, so the function is independent and will see the board and spit out the number of where it wants to go next.

Task: Make a function that can “test” if picking a particular square will mean a win. Because we don’t want to change the actual board when we test moves, we also need a function to make a temporary board.

def getBoardCopy(b):
    # Make a duplicate of the board. When testing moves we don't want to 
    # change the actual board
    dupeBoard = []
    for j in b:
        dupeBoard.append(j)
    return dupeBoard

def testWinMove(b, mark, i):
    # b = the board
    # mark = 0 or X
    # i = the square to check if makes a win 
    bCopy = getBoardCopy(b)
    bCopy[i] = mark
    return checkWin(bCopy, mark)

We call this first for each square to check if the computer can win, then for each square to check if the player can win (so the computer needs to block). Now we have all we need:

def getComputerMove(b):
    # Check computer win moves
    for i in range(0, 9):
        if b[i] == ' ' and testWinMove(b, 'X', i):
            return i
    # Check player win moves
    for i in range(0, 9):
        if b[i] == ' ' and testWinMove(b, '0', i):
            return i
    # Play a corner
    for i in [0, 2, 6, 8]:
        if b[i] == ' ':
            return i
    # Play center
    if b[4] == ' ':
        return 4
    #Play a side
    for i in [1, 3, 5, 7]:
        if b[i] == ' ':
            return i

Problems

So we’ve got a reasonable AI. How can this be beat? In short, the computer doesn’t go center when it should. For example, with computer going first as ‘X’ and player second as ‘O’:

imp0win2

which can be done exactly the same way if the player goes first. Another way that works if the player(‘O’) goes first:

imp0win3

These wins are of course rotatable, offering up more combinations of moves that result in wins.

To test this (and other algorithms) I made a simulation script that ran 10 000 brute force games; the algorithm against a “dumb” player selecting any free square at random. The results for this algorithm:

Player first: 8088 cpu, 1410 draws, 501 player

CPU first:     9689 cpu, 254 draws, 57 player

This needs improving. The player gets a good amount of wins going first and even gets some going second!

Making it Unbeatable

Let’s try and make an algorithm that is unbeatable.

Forks

What the above is missing is the whole concept of “forks”. This is when a player makes a move that gives them two opportunities at a three in a row. It’s effectively a “checkmate in 1” situation that means the other player can’t block both three in a row possibilities, and therefore loses. In the below situation, either of the moves indicated by green crosses would create a fork for crosses.

fork

The computer should make a move that gives itself a fork opportunity. If not, it should block moves that give the player a fork opportunity.

Coding It Up

Function to test if a move i would give two ways of winning j.

def testForkMove(b, mark, i):
    # Determines if a move opens up a fork
    bCopy = getBoardCopy(b)
    bCopy[i] = mark
    winningMoves = 0
    for j in range(0, 9):
        if testWinMove(bCopy, mark, j) and bCopy[j] == ' ':
            winningMoves += 1
    return winningMoves >= 2

Let’s rewrite the AI function to check for forks

def getComputerMove(b):
    # Check computer win moves
    for i in range(0, 9):
        if b[i] == ' ' and testWinMove(b, 'X', i):
            return i
    # Check player win moves
    for i in range(0, 9):
        if b[i] == ' ' and testWinMove(b, '0', i):
            return i
    # Check computer fork opportunities
    for i in range(0, 9):
        if b[i] == ' ' and testForkMove(b, 'X', i):
            return i
    #  Check player fork opportunities
    for i in range(0, 9):
        if b[i] == ' ' and testForkMove(b, '0', i):
            return i
    # Play a corner
    for i in [0, 2, 6, 8]:
        if b[i] == ' ':
            return i
    # Play center
    if b[4] == ' ':
        return 4
    #Play a side
    for i in [1, 3, 5, 7]:
        if b[i] == ' ':
            return i

Player first: 7764 cpu, 2036 draws, 200 player

CPU first:     9769 cpu, 231 draws, 0 player

That’s better! At least the CPU wins/draws every time it goes first. But the player can still force a win if they go first.

Tweaks

There are still some specific games that we are allowing the player to win. Firstly, the computer should preference center above corners to prevent the following scenario, where the computer is forced into blocking (board 4), allowing the player to fork (board 5).

imp1win1

There is also now a very special case that would still slip through the cracks if we just did the above. It is where the player has TWO squares that gives them a fork opportunity. The current algorithm finds the first one of these (board 4) and spits it out, giving the player the luxury of using the other fork (board 5)

imp1win2

To get round this, the computer needs to force the player to block an immediate three in a row, by choosing any side instead of the corner on board 4.

And so we have the final algorithm:

  1. Check for CPU win move
  2. Check for player win move
  3. Check for CPU fork opportunity
  4. Check for player fork opportunity, but choose to force a block if there are two potential forks
  5. Check center
  6. Check corners
  7. Check sides
def getComputerMove(b):
    # Check computer win moves
    for i in range(0, 9):
        if b[i] == ' ' and testWinMove(b, 'X', i):
            return i
    # Check player win moves
    for i in range(0, 9):
        if b[i] == ' ' and testWinMove(b, '0', i):
            return i
    # Check computer fork opportunities
    for i in range(0, 9):
        if b[i] == ' ' and testForkMove(b, 'X', i):
            return i
    # Check player fork opportunities, incl. two forks
    playerForks = 0
    for i in range(0, 9):
        if b[i] == ' ' and testForkMove(b, '0', i):
            playerForks += 1
            tempMove = i
    if playerForks == 1:
        return tempMove
    elif playerForks == 2:
        for j in [1, 3, 5, 7]:
            if b[j] == ' ':
                return j
    # Play center
    if b[4] == ' ':
        return 4
    # Play a corner
    for i in [0, 2, 6, 8]:
        if b[i] == ' ':
            return i
    #Play a side
    for i in [1, 3, 5, 7]:
        if b[i] == ' ':
            return i

Let’s run the simulation one last time:

Player first: 8846 cpu, 1154 draws, 0 player

CPU first:     9731 cpu, 269 draws, 0 player

And we’ve done it!

Optional Extra: Making the Unbeatable Algorithm Better

Awesome! We now have an unbeatable algorithm. But could we do better?

“Better!?” you say. “It’s unbeatable, the job is done, let me go outside and play now, I’ve deserved it!”

Well, yes and no.

We’ve made an algorithm that can’t be beat. But that’s not quite the same as the most optimal algorithm. There are some minor improvements that would up the number of wins for the CPU against a random player. I leave this as an exercise.

The other tweak is that if move position doesn’t matter (e.g “any corner”, “any side”, “either of these two fork opportunities”) instead of checking them in order, you might want to pick one at random. This will give the player the illusion that the computer is playing differently each time.

Finally, refer to this  xkcd picture of the exact optimal plays for any opponent scenario. At the end of the day, the absolute “best” algorithm would implement this as if/ else statements; both unbeatable AND giving the CPU the most opportunities to win.

2 thoughts on “Python: Coding unbeatable Tic Tac Toe AI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s