Mean distance between matrix entries

Disclaimer. This is by no means a complete and fully rigorous answer, but should be a pretty good starting point.

First, unless I misunderstand what you're trying to capture with your definition of neighbour, shouldn't it be that $(x_1,y_1)\neq(x_2,y_2)$ are neighbours if and only if one of the following occurs

$\Delta x=\Delta y=1$;
$\Delta x=1$ and $\Delta y=0$; or
$\Delta x=0$ and $\Delta y=1$.

Otherwise $\Delta x+\Delta y=2$ could allow $(1,1)$ to be a neighbour of $(1,3)$, which would mean that there are actually $10$ non-neighbour combinations in a $3\times 3$ matrix.

If my comment above is actually what you're interested in, then the number $C_n$ of possible combinations of non-neighbouring matrix entries in an $n\times n$ matrix is the number of ways to place 2 nonattacking kings on an $n\times n$ board, which is apparently given by the explicit formula $$C_n=\frac{(n - 1)(n - 2)(n^2 + 3n - 2)}2,\qquad n\in\mathbb N.$$ Note that this means that $C_3=16$ instead of $18$. This formula is confirmed by the following MatLab code (you can play around with the parameter $N$ to change the dimension, and the output $C$ is the number of pairs we're looking for).

N=3; % Dimension of matrix

X=zeros(2,N^2); % 2 x N^2 array whose columns represent matrix entries

% The loop below generates all matrix entries
ctr=1;
for(i=1:N)
    for(j=1:N)
        X(1,ctr)=i;
        X(2,ctr)=j;
        ctr=ctr+1;
    end
end

% All possible combinations of two entries
C=nchoosek(N^2,2);

% The loop below goes through all combinations.
% If a combination contains neighbouring entries, we remove one from C.
% After going through all combinations, the number C left is precisely
% the number of non-neighbouring combinations.
for(i=1:N^2)
    for(j=i+1:N^2)
        dx=abs(X(1,i)-X(1,j));
        dy=abs(X(2,i)-X(2,j));
        if((dx==1 && dy==0) || (dx==0 && dy==1) || (dx==1 && dy==1))
            C=C-1;
        end
    end
end
C

I suspect that a nice exact formula for the average Euclidean distance of a uniform non-neighbouring pair for arbitrary $n$ may not exist. However, it could still be possible to get a tractable asymptotic answer. In this line of thought, my suspicion is that your random variable can be thought of as a finite-dimensional approximation of the average distance between two random points in a square, which is apparently equal to $$\gamma:=\frac{2+\sqrt{2}+5\log(1+\sqrt{2})}{15}.$$

More precisely, we can represent an $n\times n$ matrix as a square grid inside the unit square (see here for an example with $n=13$), where the sides of the smaller squares in the grid are of size $n^{-1}$, and each smaller square represents a matrix entry.

Given two points $z_1=(x_1,y_1)$ and $z_2:=(x_2,y_2)$ sampled uniformly from the unit square in $\mathbb R^2$, we can obtain two randomly selected matrix entries by looking at which of the smaller squares $z_1$ and $z_2$ land into. Of course, this does not take into account your requirement that the matrix entries be non-neighbours, but for very large $n$, the probability that two random points $z_1$ and $z_2$ in $[0,1]^2$ land in neighbouring squares converges to zero, so I expect that this approximation should work out regardless.

If this approximation does indeed work, then we can expect that the average euclidean distance you're looking for, at least for large $n$, is approximately equal to $$n\cdot\gamma= n\cdot \frac{2+\sqrt{2}+5\log(1+\sqrt{2})}{15}.$$

After doing some MatLab simulations, it appears that this is a pretty good approximation, as shown in the picture below (the purple line is the asymptotic value $n\cdot \gamma$, and the yellow $+$'s are the actual averages computed using a very similar code to the one I used above to compute the values of $C_n$).

enter image description here

Disclaimer: This approach is a work in progress, and it is not yet clear whether it leads to a nice solution. But perhaps it inspires someone else to a solution.

The mean distance between a pair of points, including neighbours, is $$\frac12\cdot\frac{1}{\binom{n^2}{2}}\sum_{i,j,k,l}\sqrt{(i-j)^2+(k-l)^2}=\frac{1}{n^2(n^2-1)}\sum_{i,j,k,l}\sqrt{(i-j)^2+(k-l)^2},$$ where all indices run from $1$ to $n$. Here we count every pair twice, and hence divide the whole thing by $2$. The contribution from the pairs of neighbours is as follows: \begin{align*} \text{corners}:&&4\times(2\times1+1\times\sqrt{2}) =&&8+4\sqrt{2},\\ \text{sides}:&&4(n-2)(3\times1+2\times\sqrt{2}) =&&4(3+2\sqrt{2})n-8(3+2\sqrt{2}),\\ \text{center}:&&(n-2)^2(4\times1+4\times\sqrt{2}) =&&4(1+\sqrt{2})n^2-16(1+\sqrt{2})n+16(1+\sqrt{2}). \end{align*} So the total number of pairs we overcounted is $$4\times3+(n-2)\times5+(n-2)^2\times8=8n^2-54n+78,$$ and the sum of their distances is $$4(1+\sqrt{2})n^2-4(1+2\sqrt{2})n+4\sqrt{2}.$$ Now it remains to determine the value of the sum $$S(n):=\sum_{i,j,k,l}\sqrt{(i-j)^2+(k-l)^2},$$ where the indices all range from $1$ to $n$. This seems nice enough that someone might have written down some thoughts on it before...

Mean distance between matrix entries

Tags:

Linear Algebra

Probability Theory

Matrices

Related

Recent Posts