Predicting results of a game (TicTacToe), with players using fixed strategies

mathematical modelingprobabilityproblem solving

How would you predict the distribution of wins/losses/ties for two computer players in a game of TicTacToe Where each player has a fixed strategy for playing?

In my situation both players are using the same strategy which is as follows:

1) If there is a winning move, go there.

2) Else, If there is a winning move for the opponent, go there (stop the opponent from winning.)

3) Else, choose a remaining open move at random (from a uniform distribution).

Player-1 always goes first, and Player-2 second.

After doing some simulations with this strategy I got the following results:
For 100,000 simulated games:

Ties: 51,446 (51.45%)

Player-1 wins: 31,011 (31.01%)

Player-2 wins: 17,543 (17.54%)

So how would one begin to model this? My instinct would be to use Markov chains.

Any resources for where to look, or study up on would also be very helpful.

Best Answer

You could definitely use a Markov chain for this, but given that there is no possibility of repeating a position in Tic Tac Toe, this is a bit overkill - elementary techniques work just fine for this sort of thing, since the answer is just successively taking weighted averages of a bunch of probabilities. For instance, if you had this position (with X to move): $$\begin{array}{ccc}X & O & X\\ O & * & *\\ *&* & *\\ \end{array}$$ where the $*$'s are empty spaces, you would find that the probability of winning/losing/drawing are just the average of the respective probabilities of each of the $5$ positions $X$ might randomly choose - which, could each be computed the same way. There are not so many positions, at least in computational terms, so just programming a computer to run this calculation is not so intensive, as long as you save the results of each computation (i.e. use memoization).

In concrete terms, the algorithm to compute this quantity is simply: let $L$ represent the current board state. First, check if someone has won in $L$ or if a draw has been reached - the probability will be either $0$ or $1$ in these cases. If not, compute every move that the player whose turn it is might reasonably make (i.e. if they can win, a winning move. If not, but their opponent could win, a move to block that. If not, any legal move). Compute the probabilities of winning from those states and take their average. Save the result. Note that this method will never touch unreachable states.

In Mathematica, this is implemented as follows - one can modify the ReasonableMoves function for other strategies - or write this in other languages. Since it seems like you have a working simulation already (unless you did 100,000 trials by hand), you could probably easily modify it to give an exact answer instead of an approximation, as long as your language has some easy way to support exact rational arithmetic and an associative container for memoizing positions.

IsWinForPlayer[p_, l_] := With[{occupied = Map[# == p &, l, {2}]},
   Or[Or @@ (And @@ # & /@ occupied), 
    Or @@ (And @@ # & /@ Transpose[occupied]), 
    occupied[[1, 1]] && occupied[[2, 2]] && occupied[[3, 3]], 
    occupied[[1, 3]] && occupied[[2, 2]] && occupied[[3, 1]]]];
IsDraw[l_] := Plus @@ (Plus @@ Map[Abs, l, {2}]) == 9;
WhoseTurn[l_] := If[Plus @@ (Plus @@ l) == 0, 1, -1];
EmptyPositions[l_] := Position[l, 0, {2}];
ReasonableMoves[l_] := 
  Module[{empty, player, possible, winning, opponentWin},
   empty = EmptyPositions[l];
   player = WhoseTurn[l];
   possible = ReplacePart[l, # -> player] & /@ empty;
   winning = Select[possible, IsWinForPlayer[player, #] &];
   If[Length[winning] > 0, winning];
   opponentWin = 
    Select[empty, 
     IsWinForPlayer[-player, ReplacePart[l, # -> -player]] &];
   If[Length[opponentWin] > 0, 
    Return[ReplacePart[l, # -> player] & /@ opponentWin]];
   possible
   ];
StartingPosition = {{0, 0, 0}, {0, 0, 0}, {0, 0, 0}};
ProbabilityOfWin[p_, l_] := 
  ProbabilityOfWin[p, l] = 
   Which[IsWinForPlayer[p, l], 1, IsWinForPlayer[-p, l] || IsDraw[l], 
    0, True, Mean[ProbabilityOfWin[p, #] & /@ ReasonableMoves[l]]];

It gives a probability of $347/1680$ for the first player winning and $169/1680$ for the second player and only takes about 1 second of computation for each on my laptop (in Mathematica - a language not known for speed). These numbers seem a lot lower than your simulation (which should have been very accurate for the number of trials) - so there might be some discrepancy in what the actual strategy used was - but the method generalizes to any strategy. This method can also be modified to find the optimal strategy by instead computing, for each position, whether it is a win, draw, or loss under optimal play by looking at whether each legal move from that position is a win, draw, or loss.