Question on upper triangular matrix with complex eigenvalues with modulus less than 1

Here is what I had in mind when I added that exercise to the third edition of Linear Algebra Done Right:

By Schur's Theorem (which is in the same section as this exercise), there is an orthonormal basis of $V$ such that $T$ has an upper-triangular matrix $A$ with respect to that basis. Orthonormality is important so that we can compute norms. We will work with $A$ instead of $T$ and $z \in \mathbf{C}^n$ instead of $v$.

The entries on the diagonal of $A$ are the eigenvalues of $T$. Thus all the diagonal entries of $A$ have absolute value less than $1$.

If we replace each entry in $A$ with its absolute value and each coordinate of $z$ with its absolute value, then $\|A^mz\|$ gets larger or remains the same for each positive integer $m$, and $\|z\|$ remains the same. Thus we can assume without loss of generality that each entry of $A$ is nonnegative.

If we now replace each entry on the diagonal of $A$ with the largest element on the diagonal of $A$, then $\|A^mz\|$ gets larger or remains the same for each positive integer $m$. Thus we can assume without loss of generality that there exists a nonnegative number $\lambda < 1$ such that every entry of the diagonal $A$ equals $\lambda$.

We can write $$ A = \lambda I + N, $$ where $N$ is an upper-triangular matrix whose diagonal entries all equal $0$. Thus $N^n = 0$.

Let $m$ be a positive integer with $m > n$. Then

\begin{align} A^m &= (\lambda I + N)^m \\[6pt]&= \lambda^m I + m \lambda^{m-1} N + \tfrac{m(m-1)}{2} \lambda^{m-2} N^2 \\[6pt] &\quad + \tfrac{m(m-1)(m-2)}{2 \cdot 3} \lambda^{m-3} N^3 + \dots + \tfrac{m(m-1) \dotsm (m - n + 1)}{(n-1)!} \lambda^{m-n+1} N^{n-1}, \end{align}

where the other terms in the binomial expansion do not appear because $N^k = 0$ for $k \ge n$. In the expression above, the coefficients of the $n$ matrices $I$, $N$, $N^2$, $N^3$, $\dots$, $N^{n-1}$ are all bounded by $$ \frac{m^n \lambda^m}{\lambda^{n-1}}. $$ Think of $n$ and $\lambda$ as fixed. Then the limit of the expression above as $m \to \infty$ equals $0$ (because $0 \le \lambda < 1$).

Because the sum above has only a fixed number $n$ of terms, we thus see that by taking $m$ sufficiently large, we can make all the entries of the matrix $A^m$ have absolute value as small as we wish. In particular, there exists a positive integer $m$ such every entry of $A^m$ has absolute value less than $\epsilon/n$.

The definition of matrix multiplication and the Cauchy-Schwarz Inequality imply that for each $j$, the entry in row $j$, column $1$ of $Az$ has absolute value at most $\epsilon \|z\|/\sqrt{n}$. Thus $\|Az\| \le \epsilon \|z\|$, as desired.

This exercise is, in my opinion, one of the harder exercises in the third edition of Linear Algebra Done Right if one uses only the tools available at that stage of the book. The proof becomes much clearer and cleaner if one uses the tools associated with the spectral radius in a Banach algebra, which shows the power of those tools.


I am going to suppose you are allowed to use the direct sum decomposition of the vector space into generalised eigenspaces for$~T$, which I believe is something Axler treats fairly early.

The norm of a vector cannot exceed the sum of the norms of its components along this direct sum decomposition, so it will suffice to show that each of these components becomes small for sufficiently high exponent$~m$ (giving a precise value for the exponent required is a gory detail that I won't go into).

Therefore, by restricting to one generalised eigenspace at a time, it suffices to prove the result in the case there is just a single eigenvalue$~\lambda\in\Bbb C$, which has $|\lambda|<1$. Then by definition of the generalised eigenspace for$~\lambda$, which due to our restriction is the whole space, the operator $N=T-\lambda I$ is nilpotent, say $N^n=0$. Now for a vector $v$ and any $m\geq n-1$ we have by the binomial formula ($\lambda I$ commutes): $$ T^m(v)=(\lambda I+N)^m(v)=\sum_{k=0}^m\binom mk\lambda^{m-k}N^k(v) =\lambda^{m-n+1}\sum_{k=0}^{n-1}\binom mk\lambda^{n-1-k}N^k(v) $$ since $N^k=0$ for $k\geq n$. Note the the summand has a fixed number $n$ of terms, in each of which the vector $\lambda^{n-1-k}N^k(v)$, which his independent of$~m$, is multiplied by the scalar $\binom mk$, which depends on$~m$ as a polynomial of function of degree$~k$. The exponential factor $\lambda^{m-n+1}$ before the sum has absolute value decreasing with$~m$ sufficiently rapidly to overcome this polynomial growth, and the norm of the result therefore tends to$~0$ as $m\to\infty$.