Is there a simpler, more abstract proof of the Cayley-Hamilton theorem for matrices?

Here is my proof of the Cayley-Hamilton theorem. I'll share the intuition behind it first:

Intuition in a Nutshell: For any endomorphism $\Phi : V \rightarrow V$, the adjugate gives a factorization of the scalar endomorphism $\text{det}(\Phi) I$ into the adjugate and the map itself: $$ \text{det}(\Phi) I = \text{adj}(\Phi) \circ \Phi$$ We want to use this to get a factorization of the characteristic polynomial $p(t)$ of a given endomorphism $\phi$ into some polynomial analogous to the adjugate and a linear term $t - \phi$: $$p(t) = f(t)(t - \phi)$$ These two factorizations are analogous, and in fact, if we get the formality right, we can view them as corresponding factorizations in the isomorphic rings $\text{End}_{k[t]}(V \otimes_k k[t])$ and $\text{End}_k(V) \otimes_k k[t]$.
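Not part of the argument, but if you like sanity checks: here is a throwaway SymPy sketch of the adjugate identity $\text{det}(\Phi) I = \text{adj}(\Phi) \circ \Phi$ on one arbitrary $2 \times 2$ example (the matrix and the library are my choices, not anything from the proof).

```python
import sympy as sp

# An arbitrary 2x2 example matrix standing in for Phi.
Phi = sp.Matrix([[2, 1],
                 [3, 4]])

# The adjugate identity: adj(Phi) * Phi = det(Phi) * I.
lhs = Phi.adjugate() * Phi
rhs = Phi.det() * sp.eye(2)
assert lhs == rhs
print(lhs)  # Matrix([[5, 0], [0, 5]])
```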

Let's try to work this out a little more formally. The main question is, what are the two isomorphic rings I mentioned in which these are corresponding factorizations?

Let $V$ be a finite dimensional vector space over a field $k$. One of the rings is $\text{End}_k(V)[t] = \text{End}_k(V) \otimes_k k[t]$. The characteristic polynomial $p(t)$ of $\phi \in \text{End}_k (V)$ naturally lives in $\text{End}_k(V)[t]$: its coefficients are scalars, and scalars sit inside $\text{End}_k(V)$ as multiples of the identity, so $p(t)$ is carried in by the natural map $k[t] \rightarrow \text{End}_k(V) \otimes_k k[t]$. In other language, we view $t \,\text{Id}_V - \phi$ as a polynomial with endomorphism coefficients, take its determinant (a polynomial in $k[t]$), and regard the result as an element of $\text{End}_k (V)[t]$.

The other ring is $\text{End}_{k[t]}(V \otimes_k k[t])$. The element $\Phi := 1 \otimes t - \phi \otimes 1$ lives in this ring, and we have the factorization $\text{det}(\Phi) 1_{V \otimes_k k[t]} = \text{adj}(\Phi) \Phi$.

Under the isomorphism $$\text{End}_{k[t]} ( V \otimes_k k[t]) \cong \text{End}_k (V)[t]$$ we have corresponding elements $$\Phi \leftrightarrow t - \phi$$ and $$\text{det}(\Phi) 1_{V \otimes_k k[t]} \leftrightarrow p(t).$$ Therefore, the factorization $\text{det}(\Phi) 1_{V \otimes_k k[t]} = \text{adj}(\Phi) \Phi$ corresponds to a factorization $p(t) = f(t)(t-\phi)$ in $\text{End}_k (V) [t]$. And that's the whole idea!
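If it helps to see the two corresponding factorizations concretely, here is a small SymPy sketch on one arbitrary example matrix (again just an illustration, with $f(t)$ realized as the adjugate of $tI - A$, whose entries are polynomials in $t$).

```python
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[2, 1],
               [3, 4]])

M = t * sp.eye(2) - A        # plays the role of t - phi
f = M.adjugate()             # plays the role of f(t) = adj(t - phi)
p = sp.expand(M.det())       # the characteristic polynomial p(t)

# The factorization p(t) * I = f(t) * (t*I - A), entrywise in k[t].
assert (f * M - p * sp.eye(2)).applyfunc(sp.expand) == sp.zeros(2, 2)
print(p)  # t**2 - 6*t + 5
```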


If you want a more formal version, and a construction of the claimed isomorphism, read on!

Theorem: Let $V$ be a finitely generated $k$-module. If $\phi : V \rightarrow V$ is a $k$-linear map, then the evaluation homomorphism $\text{ev}_{\phi} : k[t] \rightarrow \text{End}_k (V)$ sends the characteristic polynomial $\text{char}(\phi)$ to $0$.

Let's start by constructing an isomorphism $F : \text{End}_{k} (V)[t] \rightarrow \text{End}_{k[t]} (V \otimes_k k[t])$ as follows. We have isomorphisms $$\text{End}_{k[t]} (V \otimes_k k[t]) \cong \text{Hom}_k (V, \text{Hom}_{k[t]}(k[t], V \otimes_k k[t])) \cong \text{Hom}_k(V, V \otimes_k k[t])$$

These isomorphisms can be established by creating canonical maps in both directions and showing that they are inverse to each other (the first is the tensor-hom adjunction, the second uses $\text{Hom}_{k[t]}(k[t], M) \cong M$). Now we have a canonical map in a single direction,

$$\text{End}_k (V) \otimes_k k[t] \rightarrow \text{Hom}_k(V, V \otimes_k k[t])$$

sending $\psi \otimes t^n$ to the map sending $v$ to $\psi(v) \otimes t^n$. This map is injective, and it is surjective because $V$ is finitely generated. Composing these isomorphisms gives an isomorphism $F : \text{End}_{k} (V)[t] \rightarrow \text{End}_{k[t]} (V \otimes_k k[t])$.
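Concretely (for $V = k^n$, with a basis chosen), $F$ just re-reads a polynomial whose coefficients are matrices as a single matrix whose entries are polynomials. Here is a tiny SymPy illustration of that re-reading; the particular coefficient matrices are arbitrary examples, nothing canonical.

```python
import sympy as sp

t = sp.symbols('t')

# An element of End_k(V)[t]: a polynomial with matrix coefficients,
# here C0 + C1*t + C2*t**2 for arbitrary example matrices.
C0 = sp.Matrix([[1, 2], [0, 1]])
C1 = sp.Matrix([[0, 1], [1, 0]])
C2 = sp.eye(2)
coeffs = [C0, C1, C2]

# Its image under F in End_{k[t]}(V tensor k[t]): a single matrix
# whose entries are polynomials in t.
image = sum((C * t**i for i, C in enumerate(coeffs)), sp.zeros(2, 2))
print(image)  # Matrix([[t**2 + 1, t + 2], [t, t**2 + 1]])
```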

Now we argue as before. View $t - \phi$ as a $k[t]$-linear endomorphism of $V \otimes_k k[t]$. Under the isomorphism $F$, $\text{char}(\phi)$ maps to $\text{det}(t - \phi) 1_{V \otimes_k k[t]}$, and $F(t - \phi) = t - \phi$. Now $t - \phi$ divides $\text{det}(t - \phi) 1_{V \otimes_k k[t]}$ in $\text{End}_{k[t]}(V \otimes_k k[t])$, since $\text{det}(t - \phi) 1_{V \otimes_k k[t]} = \text{adj}(t - \phi)(t - \phi)$, where $\text{adj}(t - \phi)$ is the adjugate. Therefore $t - \phi$ divides $\text{char}(\phi)$ in $\text{End}_{k}(V)[t]$, say $\text{char}(\phi) = g(t)(t - \phi)$ with $g(t) = \sum_i g_i t^i$. Expanding, $\text{char}(\phi) = \sum_i g_i t^{i+1} - \sum_i g_i \phi\, t^i$, and substituting $\phi$ for $t$ (with the powers of $\phi$ written on the right) gives $\sum_i g_i \phi^{i+1} - \sum_i g_i \phi \cdot \phi^i = 0$. Since the coefficients of $\text{char}(\phi)$ are scalars, this substitution agrees with applying the evaluation homomorphism $\text{ev}_{\phi} : k[t] \rightarrow \text{End}_k (V)$, which therefore sends the characteristic polynomial $\text{char}(\phi)$ to $0$.
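And if you prefer to see the end result in coordinates rather than abstractly: plugging a matrix into its own characteristic polynomial does give $0$. A quick SymPy check on a made-up example (the matrix and helper names are mine, not part of the proof).

```python
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[2, 1],
               [3, 4]])

# char(phi) as a polynomial in t, leading coefficient first: [1, -6, 5].
coeffs = A.charpoly(t).all_coeffs()
deg = len(coeffs) - 1

# ev_phi(char(phi)) = sum of c_i * A**(deg - i).
chi_of_A = sum((c * A**(deg - i) for i, c in enumerate(coeffs)),
               sp.zeros(2, 2))
print(chi_of_A)  # Matrix([[0, 0], [0, 0]])
```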


This has no pretense to be "The" answer!

I am no algebraist, but I remember a nice proof I was taught when I was a student, so I thought I'd share it.

In a nutshell: True for diagonalizable matrices, then use "algebraic continuation".

Let's write down some details.

Lemma ("algebraic continuation"): Let $k$ be an infinite field. Let $P, Q \in k[X_1, \dots, X_n]$ be polynomials of $n$ variables with $Q \neq 0$ . If $P$ vanishes on the set $\{x \in k^n ~\colon~ Q(x) \neq 0\}$, then $P = 0$.

This lemma expresses that "non-empty open sets are dense in the Zariski topology". It's not hard to show (I'll give you a hint if you want).

Now:

Theorem (Cayley-Hamilton): Let $R$ be a commutative unital ring. Let $A \in M_n(R)$ be a square matrix and denote by $\chi_A(X) \in R[X]$ its characteristic polynomial. Then $\chi_A(A)$ is the zero matrix.

Let's give a proof when $R = k$ is an infinite field for the moment. By the "algebraic continuation" lemma, it is enough to show that the theorem is true when $A$ lies in some "dense open set". More precisely, each entry of the matrix $\chi_A(A)$ is a polynomial in the $n^2$ entries of $A$. It is enough to show that it vanishes on some set $\{Q \neq 0\}$, where $Q$ is a nonzero polynomial in $n^2$ variables. Let's take $Q(A) = \mathrm{Disc}(\chi_A)$, the discriminant of the polynomial $\chi_A$, viewed as a polynomial in the entries of $A$. The set where $Q \neq 0$ consists precisely of the matrices $A$ whose eigenvalues are all distinct in an algebraic closure $\bar{k}$ of $k$. Such matrices are diagonalizable over $\bar{k}$, so it is easy to check that $\chi_A(A) = 0$ (I'll let you do that).
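Purely as an illustration of the two ingredients (the discriminant detects distinct eigenvalues, and such matrices satisfy $\chi_A(A) = 0$), here is a quick SymPy sketch on an arbitrary example matrix; it is not part of the proof itself.

```python
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[0, 2],
               [1, 1]])           # eigenvalues 2 and -1, distinct

chi = A.charpoly(t).as_expr()     # t**2 - t - 2

# Nonzero discriminant <=> distinct eigenvalues in an algebraic closure.
print(sp.discriminant(chi, t))    # 9

# And chi_A(A) = 0 for this (diagonalizable) matrix.
coeffs = A.charpoly(t).all_coeffs()
deg = len(coeffs) - 1
chi_of_A = sum((c * A**(deg - i) for i, c in enumerate(coeffs)),
               sp.zeros(2, 2))
print(chi_of_A)                   # Matrix([[0, 0], [0, 0]])
```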

We're done!

Wait, how does this extend to an arbitrary unital ring? Well, each of the entries of the matrix $\chi_A(A)$ is actually a polynomial with integer coefficients in the $n^2$ entries of $A$. These polynomials vanish at every point of $\mathbb{Q}^{n^2}$, because we showed that Cayley-Hamilton holds over the infinite field $\mathbb{Q}$; hence they are the zero polynomials in $\mathbb{Z}[X_{11}, \dots, X_{nn}]$, and the identity $\chi_A(A) = 0$ therefore holds over any commutative unital ring. (NB: I think some people would say something like "$\mathbb{Z}$ is an initial object in the category of unital rings" or whatever.)
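If you want to see the "universal polynomial identity with integer coefficients" concretely, here is a minimal SymPy sketch for $n = 2$: take a matrix with generic symbolic entries and watch every entry of $\chi_A(A)$ expand to the zero polynomial in $\mathbb{Z}[a, b, c, d]$. (The symbols and the $2 \times 2$ case are just my illustration.)

```python
import sympy as sp

t = sp.symbols('t')
a, b, c, d = sp.symbols('a b c d')
A = sp.Matrix([[a, b],
               [c, d]])           # generic 2x2 matrix

# chi_A(t) = t**2 - (a + d)*t + (a*d - b*c): integer-coefficient
# polynomials in the entries of A.
coeffs = A.charpoly(t).all_coeffs()
deg = len(coeffs) - 1

# Every entry of chi_A(A) is the zero polynomial, which is the identity
# that then holds over any commutative unital ring.
chi_of_A = sum((c_i * A**(deg - i) for i, c_i in enumerate(coeffs)),
               sp.zeros(2, 2))
print(chi_of_A.applyfunc(sp.expand))  # Matrix([[0, 0], [0, 0]])
```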