Determining the number of valid TicTacToe board states in terms of board dimension
Fun question.
I don't have enough karma to add this as a comment, so I'll offer it up as an answer, although it is (currently) an incomplete one.
Some comments
The reference you link to for the $N=3$ case does not say that there are $255,168$ valid 3-by-3 tic-tac-toe boards, but that there are $255,168$ distinct 3-by-3 tic-tac-toe games.
That is, if you make a "game tree" where each (valid) board is a node and each move is an edge, you're asking how many nodes are in the tree. The Wikipedia article states that there are $255,168$ distinct paths through the tree (from the root to a leaf), but there are substantially fewer nodes. In fact we can set an upper bound on the number of distinct 3-by-3 tic-tac-toe boards by ignoring the rules of the game and noting that there are only $3^9$ ways to fill a 9x9 board with 3 tokens (blank, X and O), which is only $19,683$. And many of those have too many Xs or Os to be valid.
(The Wikipedia article is also taking symmetry into account, which if I understand you correctly, you want to ignore.)
I happen to be working on a program that generates this game tree for entirely different reasons.
The program may be buggy, so take the numbers above with a grain of salt, but it looks right to me on inspection. I believe I've seen other pages online that corroborate the $N=3$ case, but at the time I was only trying to validate that my program was working properly, so I didn't bother the record the link.
Currently my program runs out of memory (on my little netbook) when I try to generate the game tree for $N=4$, but I imagine that's fixable with a little bit beefier hardware and/or a little optimization of the program for memory footprint (currently I'm storing a lot of metadata about each board state).
Circumstances permitting, I'll try come back and update this answer with more information if and when I have some to share, but I wanted to put some hard numbers (and specific constraints) on the table because I think it clears up some of the confusion in the comment thread.
Assumptions
To clarify the constraints on the problem, here's what I'm assuming:
- We're interested in counting the number of valid "game states" on an $N$-by-$N$ tic-tac-toe board.
- We're ignoring symmetry, so that a board with one
X
in the upper-left corner is considered a distinct board from the one with anX
in the lower-right corner. X
always moves first.- The game stops with either player makes a horizontal, vertical or diagonal line of length $N$. Any boards that can only be reached by continuing to play after one of the players has made a line are considered invalid and ignored.
- (The game also stops when we run out of free spaces to move, of course.)
Some data based on enumeration
Under these constraints, as you wrote, we can confirm that:
- for $N=1$ there are $2$ valid and distinct boards
- for $N=2$ there are $29$ valid and distinct boards
Based on (programmatic) inspection, I think we can say that:
- for $N=3$ there are $5,478$ valid and distinct boards
- for $N=4$ there are $9,722,011$ valid and distinct boards
If you break these down by ply (turn):
N: 1 2 3 4
------- - -- ---- -------
ply 0: 1 1 1 1
ply 1: 1 4 9 16
ply 2: - 12 72 240
ply 3: - 12 252 1680
ply 4: - 0 756 10920
ply 5: - - 1260 43680
ply 6: - - 1520 160160
ply 7: - - 1140 400400
ply 8: - - 390 895950
ply 9: - - 78 1433520
ply 10: - - - 1962576
ply 11: - - - 1962576
ply 12: - - - 1543080
ply 13: - - - 881760
ply 14: - - - 333792
ply 15: - - - 83440
ply 16: - - - 8220
======= = == ==== =======
TOTAL: 2 29 5478 9722011
-------
I don't see an obvious formula for the sequence $(2,29,5478,9722011,...)$, but a few interesting (IMO) observations about this:
$N=3$ is the smallest board for which player
O
can win$N=3$ is the smallest board that can end in a tie game
$N=2$ is likely the only board that cannot be filled (
X
is guaranteed to win on the third move, leaving one slot unfilled)Both of the even examples have two plies in a row with the exactly same number of boards (for $N=2$ plies 2 and 3 have $12$ boards and for $N=4$ plies 10 and 11 have $1,962,576$). These are also the "widest" plies in their respective trees. (The same is true for $N=1$, but I imagine that's a degenerate case.)
(This didn't hold for $N=4$.)
It may just be the law of small numbers, but I notice that the "widest" tier of the game tree is the one where there are $N$ empty spaces left on the board. For $N=1$, this is ply 0 with $1$ board, for $N=2$ this is ply 2 with $12$ boards and for $N=3$ this is ply 6, with $1,520$ boards.and of course, at $N=1$, player
O
doesn't even get to move.
Looking at upper bounds
By the way, as a very rough sanity check, I compared the number of valid boards with $3^{(N^2)}$ (the number of distinct ways to fill an $N\times{}N$ board with 3 symbols):
N: 1 2 3 4
-------- ---- ---- ------- ----------
# Valid: 2 29 5478 9722011
3^(N^2): 3 81 19683 43046721
% Valid: 66.7 35.8 27.8 22.6
We can get a tighter upper bound if we look at the number of ways to fill an $N\times{}N$ board with $count(X) = count(O)$ or $count(X) = count(O)+1$.
That's actually not the hard to figure out if you change your thinking a little bit. Instead of alternating between X and O, imagine you put down all the Xs first, then all the Os.
On the zeroth ply, there are no Xs or Os, so we always have 1 board. (Note that this is $N^2$ choose $0$, written ${{N^2} \choose 0}$).
On the first ply, there is exactly one X, so we have ${{N^2} \choose 1}$ distinct boards.
On the second ply, there is exactly one X and exactly one O, so we have ${{N^2} \choose 1} \times {((N^2)-1) \choose 1}$ boards.
On the third ply, there are two Xs and one O, so we have ${{N^2} \choose 2} \times {{((N^2)-2)} \choose 1}$ boards.
On the fourth ply, there are two Xs and two Os, so we have ${{N^2} \choose 2} \times {{((N^2)-2)} \choose 2}$ boards.
And so on.
In the general case
Note that on ply $p$ there will be $floor({{(p+1)}\over{2}})$ Xs and $floor({{p}\over{2}})$ Os on the board, so let's say:
- $x = floor({{(p+1)}\over{2}})$
and
- $o = floor({{p}\over{2}})$
Note that:
There are ${{N^2} \choose x}$ ways to place $x$
X
marks on an $N\times{}N$ board.There are ${{N^2 - x} \choose o}$ ways to place $o$
O
marks on an $N\times{}N$ board that is already filled with $x$X
s.
Hence ignoring winners there are:
${{N^2} \choose x} \times {{N^2 - x} \choose o}$
distinct boards at ply $p$. Or (plugging the definitions of $x$ and $o$ from above):
${{N^2} \choose {floor({{p+1}\over{2}})}} \times {{N^2 - {floor({{p+1}\over{2}})}} \choose {floor({p \over 2})}}$
So an upper bound on the number of distinct boards would the be summation of that ugly formula over $p = 0$ to $p = N^2$.
I imagine that the clever application of algebra could dramatically simplify that expression (especially when you plug the definition of $n \choose k$ in for the choose notation).
To get an even better count we could take into account the winning boards.