How to efficiently sample edges from a graph in relation to its spanning tree

While the other answer is correct, it requires the computation of $|E| + 1$ many determinants. There is a faster route when $|E|$ is large. The first thing to note is Kirchoff's theorem which states that if $T$ is a uniform spanning tree then $$P(e \in T) = \mathscr{R}(e_- \leftrightarrow e_+)$$ where $e = \{e_-, e_+\}$ and $\mathscr{R}(a \leftrightarrow b)$ is the effective resistance between $a$ and $b$ when each edge is given resistance $1$. This implies that the probability an edge is sampled in your process is $$\mathscr{R}(e_- \leftrightarrow e_+)/(|V| - 1).$$

Thus we only need to compute the effective resistance.

If we let $L$ denote the graph Laplacian and $L^+$ to be its Moore-Penrose pseudoinverse, then

$$\mathscr{R}(a \leftrightarrow b) = (L^+)_{aa} + (L^+)_{bb} - 2 (L^+)_{ab}. $$

(See this master's thesis for some nice discussion and references.)

Thus, the only computational overhead for computing the marginals is computing a single psuedoinverse. Depending on how large $|E|$ is, this may be faster than computing $|E|$ many determinants.

EDIT: some discussion on complexity

The Pseudoinverse of an $n \times n$ matrix can be done in $O(n^3)$ time. So computing $L^+$ takes $O(|V|^3)$ time. We have to compute this for $|E|$ many edges, so the above computes all marginals in $O(|E| |V|^3)$ time. Conversely, a determinant can be done in, say, $O(n^{2.3})$ time. So the other answer has complexity $O(|E|^2 |V|^{2.3}).$ Since $G$ is connected, $|E| \geq |V|-1$ and so this algorithm is always faster (asymptotically, at least).


Let $\tau(G)$ denote the number of spanning trees in $G$, and let $G \bullet vw$ denote edge contraction: it is the multigraph in which adjacent vertices $v$ and $w$ are replaced by a single vertex $x$, and all edges incident to either $v$ or $w$ are changed to be adjacent to $x$.

The spanning trees of $G$ containing edge $vw$ are in bijection with the spanning trees of $G \bullet vw$, and so the probability that your process will return $vw$ is $$\frac{\tau(G \bullet vw)}{\tau(G)} \cdot \frac1{|V(G)|-1}.$$ We can efficiently compute $\tau(H)$ for any multigraph graph $H$ using Kirchhoff's matrix tree theorem.

(Rather than dealing with $G\bullet vw$, we could also count the spanning trees containing $vw$ as $\tau(G) - \tau(G-vw)$, but that's slightly less efficient, because the determinants are one bigger.)