Given a knight on an infinite chess board that moves randomly, what's the expected number of distinct squares it reaches in 50 moves?
An exact solution can be obtained by computing $d_{x, y}(n)$, the number of distinct paths reaching $(x, y)$ for the first time after $n$ steps. The total number of $n$-step paths $p_{x,y}(n)$ that visit $(x, y)$ at some point then satisfies the recurrence:
$$p_{x, y}(n) = 8p_{x, y}(n-1) + d_{x, y}(n)$$
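For example, $d_{1, 2}(1) = 1$ (the single move landing on $(1, 2)$), while $d_{1, 2}(2) = 0$ because each knight move flips the parity of $x + y$; the recurrence then gives $p_{1, 2}(2) = 8 \cdot 1 + 0 = 8$, the eight two-step paths whose first move already lands on $(1, 2)$.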
Using this we can express the expected number of distinct squares per path after $n$ steps: summing the number of distinct squares over all paths gives the same total as summing, over all squares, the number of paths that visit that square.
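In other words, by linearity of expectation over the indicator of each square being visited,
$$\mathbb{E}[\text{distinct squares after } n \text{ steps}] = \sum_{(x, y)} \Pr\bigl[(x, y) \text{ visited within } n \text{ steps}\bigr] = \sum_{(x, y)} \frac{p_{x, y}(n)}{8^{n}},$$
where the term for the starting square $(0, 0)$ is exactly $1$, since every path contains it.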
Then, to save some computation time, we use the four-fold rotational symmetry of the walk to find that the expected number of distinct squares per path after $n$ steps equals:
$$1 + \frac{4}{8^{n}}\sum_{(x > 0, y \geq 0)} p_{x, y}(n)$$
How do we compute $d_{x, y}(n)$? We do it using matrices $M_{x, y}(n)$, one for each target square $(x, y)$, which are initially zero everywhere except at $(0, 0)$, where they have value $1$. Then we repeatedly apply a kernel convolution encoding the possible knight moves:
$$M'_{x, y}(n)=\begin{bmatrix}0&1&0&1&0\\ 1&0&0&0&1\\ 0&0&0&0&0\\ 1&0&0&0&1\\ 0&1&0&1&0\\\end{bmatrix}\star M_{x, y}(n-1)$$
Then $d_{x, y}(n)$ is the element at $(x, y)$ in $M'_{x, y}(n)$, and we set that element to zero to obtain $M_{x, y}(n)$ for the next iteration. Because we repeatedly zero this coefficient after each iteration, we count only the distinct paths reaching $(x, y)$ for the first time, rather than all paths that visit $(x, y)$ within $n$ steps. Unfortunately this does mean that for every possible target $(x, y)$ we must iterate these matrices a total of $n$ times.
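Concretely, for $n = 50$ the board in the code below is $201 \times 201$, and by symmetry only the $100 \times 101 = 10100$ squares with $x > 0$, $y \geq 0$ need to be processed, each requiring its own sequence of $50$ convolutions.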
I've implemented the above in Rust:
use gmp::mpz::Mpz;
use gmp::mpq::Mpq;
use rayon::prelude::*;
/// One step of the walk: convolve the path-count grid `d` (side length `w`)
/// with the knight-move kernel.
fn knight_step(d: Vec<Mpz>, w: i32) -> Vec<Mpz> {
    let mut nd = vec![Mpz::new(); (w*w) as usize];
    // The eight knight-move offsets.
    for &(dx, dy) in &[(-2, -1), (-2, 1), (-1, -2), (-1, 2), (1, -2), (1, 2), (2, -1), (2, 1)] {
        // Clip the destination ranges so the source cell (x + dx, y + dy) stays on the board.
        let rx = (-dx).max(0)..(w-dx).min(w);
        let ry = (-dy).max(0)..(w-dy).min(w);
        for x in rx {
            for y in ry.clone() {
                let cx = x + dx;
                let cy = y + dy;
                nd[(x*w + y) as usize] += &d[(cx*w + cy) as usize];
            }
        }
    }
    nd
}
/// Returns 4 * sum over the quarter-plane (x > 0, y >= 0) of p_{x, y}(n).
fn count_distinct(n: usize) -> Mpz {
    let w = 4*n + 1; // Size of a side of the board.
    let z = 2*n; // Origin is at (z, z).
    let mut total_distinct = Mpz::new();
    for x in z+1..w {
        println!("{}", x);
        // Each target square (x, y) is an independent subproblem; run the y's in parallel.
        let subproblems: Vec<_> = (z..w).collect::<Vec<usize>>().par_iter().map(|y| {
            let mut d = vec![Mpz::new(); w*w];
            d[z*w + z] = Mpz::from(1);
            let mut p = Mpz::new();
            for _ in 0..n {
                d = knight_step(d, w as i32);
                // p_{x, y}(k) = 8 * p_{x, y}(k-1) + d_{x, y}(k).
                p = 8i64*p + &d[x*w + y];
                // Zero the target cell so later steps only count first visits.
                d[x*w + y] = 0.into();
            }
            p
        }).collect();
        for s in &subproblems {
            total_distinct += 4i64*s;
        }
    }
    total_distinct
}
fn main() {
    let n = 50;
    let q = Mpq::ratio(&count_distinct(n), &Mpz::from(8).pow(n as u32));
    println!("{}", q + &1.into()); // Add 1 for the starting square.
}
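For reference, this uses the gmp crate (Rust bindings to the GNU MP library) for exact big-integer arithmetic and rayon to parallelize the subproblems.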
Running this with $n = 50$ gives the exact result:
$$\frac{455207943209697100946821497094086408107332219}{11150372599265311570767859136324180752990208} \approx 40.82446$$
My experiments in Mathematica give $40.06$ as the average number of distinct cells (I've done several runs of $100'000$ trials each).
I'm counting the initial square though, because it makes the program simpler.
I wouldn't know how to approach this problem theoretically, but it's simple enough to run tests. A knight's move shifts it $2$ cells in one direction and $1$ cell in the other, which gives $8$ options on an infinite board.
Here's the code I used and a sample of the results:
Tm = 100000;
Ds = Table[1, {t, 1, Tm}];
Do[
  Nm = 50;
  P = Table[{0, 0}, {n, 1, Nm}];
  M = {{1, 2}, {-1, 2}, {1, -2}, {-1, -2}, {2, 1}, {2, -1}, {-2, 1}, {-2, -1}};
  Do[
    R = RandomInteger[{1, 8}];
    P[[n + 1]] = P[[n]] + M[[R]],
    {n, 1, Nm - 1}];
  Ds[[t]] = CountDistinct[P],
  {t, 1, Tm}];
Histogram[Ds]
N[Mean[Ds], 10]
Note that the distribution is asymmetrical.
Increasing the number of steps to $100$, I get around $77.36$, which agrees well with the rough $4/5$ ratio of distinct cells to steps.
I made $1'000'000$ tests with $100$ steps and got $77.38$, so the numbers don't change much for larger samples.
I wonder how we can get the $4/5$ estimate theoretically.
In general, this falls under the topic of random walks and their return probabilities.
Edit: It would be better to set $N_m = 51$ in my program; then it corresponds to $50$ moves, and the resulting average of $40.82$ fits well with the other answer.