Prove that the Sylvester equation has a unique solution when $A$ and $-B$ share no eigenvalues
The first implication is Bézout's identity for polynomials. It is the analogue, in the Euclidean domain of polynomials, of the familiar statement that coprime integers $x$ and $y$ admit integers $a$ and $b$ with $ax+by=1$.
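As a concrete illustration (my own aside, not part of the argument), SymPy's `gcdex` produces the Bézout coefficients for two coprime polynomials:

```python
import sympy as sp

x = sp.symbols('x')

# Two coprime polynomials; think of them as characteristic polynomials of
# matrices with disjoint spectra (roots {1, 2} vs. roots {3, 4}).
f = (x - 1) * (x - 2)
g = (x - 3) * (x - 4)

# gcdex returns (a, b, h) with a*f + b*g = h, where h = gcd(f, g).
a, b, h = sp.gcdex(f, g, x)
print(h)                           # 1, so f and g are coprime
print(sp.simplify(a*f + b*g - 1))  # 0, confirming the Bezout identity
```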
The second one can be seen inductively. $g(A)$ is a linear combination of powers $A^k$, so by linearity it suffices to prove that $A^kX=X(-B)^k$ for every integer $k\geq 1$ (the constant term is immediate, since $I$ commutes with $X$). This follows by induction:
- The base case is $AX=-XB$, which we already have.
- If it holds for $k$ (i.e. $A^kX=X(-B)^k$), then $$A^{k+1}X = A(A^kX)=A(X(-B)^k) = (AX)(-B)^k = (-XB)(-B)^k = X(-B)^{k+1},$$ where the second equality uses the induction hypothesis and the fourth uses the base case.
Hence it holds for all integers $k\geq 1$, and the implication follows.
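Here is a quick numerical sanity check of the implication (my own sketch; the matrices are an arbitrary choice constructed so that $AX=-XB$ holds):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build A, B, X with AX = -XB: take lambda an eigenvalue shared by A and -B,
# u a right eigenvector of A, v* a left eigenvector of -B, and X = u v*.
lam = 2.0
A = np.diag([lam, 5.0, 7.0])
B = -np.diag([lam, -3.0, -4.0])          # so -B has lam as an eigenvalue
u = np.array([[1.0], [0.0], [0.0]])      # A u = lam u
v = np.array([[1.0], [0.0], [0.0]])      # v* (-B) = lam v*
X = u @ v.T
assert np.allclose(A @ X, -X @ B)

# Check g(A) X = X g(-B) for a random polynomial g.
coeffs = rng.standard_normal(4)          # g(t) = c0 + c1 t + c2 t^2 + c3 t^3
gA  = sum(c * np.linalg.matrix_power(A, k)  for k, c in enumerate(coeffs))
gmB = sum(c * np.linalg.matrix_power(-B, k) for k, c in enumerate(coeffs))
print(np.allclose(gA @ X, X @ gmB))      # True
```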
This does not exactly answer the original question but provides an alternative proof that seems simpler than the one on Wikipedia as of October 2, 2020.
Theorem. Given matrices $A\in \mathbb{C}^{m\times m}$ and $B\in \mathbb{C}^{n\times n}$, the Sylvester equation $AX-XB=C$ has a unique solution $X\in \mathbb{C}^{m\times n}$ for any $C\in\mathbb{C}^{m\times n}$ if and only if $A$ and $B$ do not share any eigenvalue.
Proof. The equation $AX-XB=C$ is a linear system with $mn$ unknowns and the same number of equations. Hence it is uniquely solvable for any given $C$ if and only if the homogeneous equation $$ AX-XB=0 $$ admits only the trivial solution $0$.
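Concretely, using the vectorization identities $\operatorname{vec}(AX)=(I_n\otimes A)\operatorname{vec}(X)$ and $\operatorname{vec}(XB)=(B^\top\otimes I_m)\operatorname{vec}(X)$, the system reads $(I_n\otimes A-B^\top\otimes I_m)\operatorname{vec}(X)=\operatorname{vec}(C)$, so unique solvability for every $C$ is exactly nonsingularity of that $mn\times mn$ matrix. A minimal NumPy sketch with arbitrary random test matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 2
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))
C = rng.standard_normal((m, n))

# vec() stacks columns; vec(AX - XB) = (I_n (x) A - B^T (x) I_m) vec(X).
K = np.kron(np.eye(n), A) - np.kron(B.T, np.eye(m))
x = np.linalg.solve(K, C.flatten(order='F'))   # unique solution iff K is nonsingular
X = x.reshape((m, n), order='F')
print(np.allclose(A @ X - X @ B, C))           # True
```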
Assume that $A$ and $B$ do not share any eigenvalue. Let $X$ be a solution of the homogeneous equation above. Then $AX=XB$, which extends to $A^kX = XB^k$ for each $k \ge 0$ by induction. Consequently, $$ p(A) X = X p(B) $$ for any polynomial $p$. In particular, let $p$ be the characteristic polynomial of $A$. Then $$p(A)=0$$ by the Cayley–Hamilton theorem; meanwhile, the spectral mapping theorem gives $$ \sigma(p(B)) = p(\sigma(B)), $$ where $\sigma(\cdot)$ denotes the spectrum of a matrix. Since $A$ and $B$ do not share any eigenvalue, $p(\sigma(B))$ does not contain $0$, and hence $p(B)$ is nonsingular. Thus $0 = p(A)X = Xp(B)$ forces $X= 0$, as desired. This proves the "if" part of the theorem.
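(As a numerical aside, not part of the proof: the key step can be checked directly for a toy pair of matrices with disjoint spectra, confirming that $p(A)=0$ while $p(B)$ is nonsingular.)

```python
import numpy as np

# Matrices with disjoint spectra (an arbitrary illustrative choice).
A = np.diag([1.0, 2.0, 3.0])
B = np.diag([4.0, 5.0])

# Characteristic polynomial of A, highest-degree coefficient first.
p = np.poly(A)

def poly_at(coeffs, M):
    """Evaluate a polynomial (np.poly coefficient ordering) at a square matrix M."""
    out = np.zeros_like(M)
    for c in coeffs:                       # Horner's scheme
        out = out @ M + c * np.eye(M.shape[0])
    return out

print(np.allclose(poly_at(p, A), 0))       # True: Cayley-Hamilton
print(np.linalg.det(poly_at(p, B)))        # nonzero: p(B) is nonsingular
```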
Now assume that $A$ and $B$ share an eigenvalue $\lambda$. Let $u$ be a corresponding right eigenvector for $A$, $v$ be a corresponding left eigenvector for $B$, and $X=u{v}^*$. Then $X\neq 0$, and $$ AX-XB = A(uv^*)-(uv^*)B = \lambda uv^*-\lambda uv^* = 0. $$ Hence $X$ is a nontrivial solution to the aforesaid homogeneous equation, justifying the "only if" part of the theorem. Q.E.D.
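The construction above is easy to reproduce numerically; a brief sketch with an arbitrary pair of matrices sharing the eigenvalue $2$:

```python
import numpy as np

# A and B sharing the eigenvalue 2 (an arbitrary illustrative choice).
A = np.diag([2.0, 5.0])
B = np.array([[2.0, 1.0],
              [0.0, 3.0]])

lam = 2.0
u = np.array([[1.0], [0.0]])             # right eigenvector of A:  A u = lam u
# Left eigenvector of B: v* B = lam v*, i.e. B^T v = lam v.
w, V = np.linalg.eig(B.T)
v = V[:, np.isclose(w, lam)][:, :1]

X = u @ v.conj().T                       # X = u v*, nonzero by construction
print(np.allclose(A @ X - X @ B, 0))     # True: a nontrivial solution of AX - XB = 0
```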
Remark. The theorem remains true if $\mathbb{C}$ is replaced by $\mathbb{R}$ everywhere. The proof for the "if" part is still applicable; for the "only if" part, note that both $\mathrm{Re}(uv^*)$ and $\mathrm{Im}(uv^*)$ satisfy the homogeneous equation $AX-XB=0$, and they cannot be zero simultaneously (since $uv^*\neq 0$).
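As a practical footnote, the equation can also be solved numerically. A minimal sketch using `scipy.linalg.solve_sylvester` (SciPy's convention is $AX+XB=Q$, so $-B$ is passed in order to solve $AX-XB=C$):

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(0)
m, n = 4, 3
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))
C = rng.standard_normal((m, n))

# solve_sylvester(A, -B, C) solves A X + X (-B) = C, i.e. A X - X B = C.
# Random A and B almost surely share no eigenvalue, so the solution is unique.
X = solve_sylvester(A, -B, C)
print(np.allclose(A @ X - X @ B, C))   # True
```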