How to sample a uniform random polyomino?

The following Markov chain Monte Carlo algorithm would suffice, as it satisfies detailed balance and is ergodic. It may suffer from some sort of "critical slowing down" as the proportion of moves which are legal might decay as a power of the system size, but I don't have clear intuition about this.

I don't know if there's anything better in the literature. This is a "local" move; maybe there is a "global" move which is akin to the pivot move but which still gives a correct algorithm (e.g. rotating/reflecting/translating a piece of the polyomino and re-attaching it).

I'll just refer to the objects as "polyominoes", but they could be referred to as lattice animals, or for dimensions greater than 2 as $d$-dimensional polycubes.

  1. Initialise system, perhaps starting with a "straight rod" of length $N$, starting at the origin and extending in the positive x-axis direction, or (better) starting with a $d$-dimensional cube with a few cells missing, or (better still) starting with a configuration obtained by attaching cells at random until the correct number of cells have been added (following suggestion of Brendan McKay).
  2. Choose a random occupied cell, and move it to a randomly chosen unoccupied neighbour of the remaining $N-1$ cells. (A list of the unoccupied neighbours needs to be maintained to do this efficiently. This data structure could be based on hash table but must allow insertions and deletions.)
  3. If the resulting configuration is a polyomino, accept the move, otherwise reject the move and keep the original configuration. (The condition that needs to be checked is that the cells are connected. It suffices to check that the occupied neighbours of the original cell are connected, and this could be efficiently done via a greedy graph search algorithm.)
  4. Return to step 2.

The algorithm generates translates of polyominoes, but that's trivial to partial out.

The Markov chain can generate any polyomino starting from a seed state, and is therefore ergodic. (Start with the rod seed state, and assume that the cell at the origin is the rightmost cell of an arbitrary polyomino that you want to construct. You can do this by taking the rightmost cell of the rod and building up the required configuration cell-by-cell to the left of the origin.)

The Markov chain satisfies detailed balance by inspection.

Therefore the Markov chain will sample polyominoes uniformly at random. It would take a bit of work to estimate the mixing time by running computer experiments and then coming up with a heuristic argument. I think it's unlikely that you will be able to get an exact result for the mixing time, but it may be possible to derive an upper bound.

One significant difficulty in implementation will be the efficient checking of connectivity for each attempted move. Naively, this could take CPU time $O(N)$, but in practice a greedy search algorithm should be much faster than that in the average case. It should be possible to do better by using a dynamic bounding volume hierarchy (like my SAW-tree implementation of the pivot algorithm for self-avoiding walks) or a dynamic data structure which maintains information about connectivity (can't remember the name of the appropriate data structure for fully dynamic insertions and deletions of vertices, but they are related to link/cut trees).

Better algorithms are surely possible - e.g. the idea sketched above for a global move - but I think that this will suffice to get random polyominoes with say 1000 cells, even for naive connectivity checking. If you want to do this for a million cells then we might have to think a bit harder!


Here is an approach to estimating parameters of uniformly random polyominoes of $n$ squares without generating random ones uniformly.

First define a rooted tree whose root is the single square and whose nodes at distance $m$ from the root are the $m$-square polyominoes, with no nodes further than $n$ from the root. This is fairly easy: the parent of polyomino $P$ can be $P-s$ where $s$ is the most-northern most-western square such that $P-s$ is a polyomino.

Now you need a procedure that, given a polyomino $P$ tells you how many children it has and chooses a random child. If nothing better comes to mind, just try adding a new square in each possible place and reject it if $P$ is not the parent.

Now start at the root and follow a random path down until you come to a leaf $L$ (with $m$ squares, say). At each node, choose a child uniformly at random. If $d_0,\ldots,d_{m-1}$ are the degrees (numbers of children) of the nodes along that path (excluding $L$), then $p(L)=(d_0\cdots d_{m-1})^{-1}$ is the probability that a random path ends at $L$.

Now, if $\chi(L)$ is the indicator function of "$L$ has $n$ squares", then the expectation of $\chi(L)/p(L)$ is exactly equal to the number of $n$-square polyominoes, so you can estimate that number by sampling repeatedly. In fact, for any function $f$, the sum of $f(P)$ over all $n$-vertex polyominoes is exactly the expectation of $f(P)\chi(L)/p(L)$ for a random path. This lets you estimate the expectation of $f$ by repeated sampling.

I've considered the simplest case, where all the children of a node are chosen with equal probability as a random path is traced. The theory is valid for any distribution on the children. To get good estimates you would like the probability of choosing a child to be roughly proportional to the number of $n$-square polyominoes descended from that child. Then each $n$-vertex polyomino comes out with similar probability. If that is far from the truth, the estimates may have large variance even though they have the correct expectation.

This is an example of "sequential importance sampling". Searching on that phrase will bring up some papers; look for Diaconis in particular. I've seen this work extremely well and extremely badly on different problems.