Minimal polynomials and characteristic polynomials

The minimal polynomial is quite literally the smallest (in the sense of divisibility) nonzero polynomial that the matrix satisfies; it is normalised to be monic, which makes it unique. That is to say, if $A$ has minimal polynomial $m(t)$ then $m(A)=0$, and if $p(t)$ is a nonzero polynomial with $p(A)=0$ then $m(t)$ divides $p(t)$.

The characteristic polynomial, on the other hand, is defined by an explicit formula: $\chi(t) = \det(tI - A)$. If $A$ is an $n \times n$ matrix then its characteristic polynomial $\chi(t)$ always has degree $n$. This is not true of the minimal polynomial, whose degree can be smaller than $n$.

It can be proved that if $\lambda$ is an eigenvalue of $A$ then $m(\lambda)=0$. This is reasonably clear: if $\vec v \ne 0$ is a $\lambda$-eigenvector of $A$ then $$m(\lambda) \vec v = m(A) \vec v = 0 \vec v = 0$$ and so $m(\lambda)=0$, because $\vec v \ne 0$. The first equality here uses linearity and the fact that $A^k\vec v = \lambda^k \vec v$ for every $k \ge 0$, which is an easy induction: writing $m(t)=\sum_k c_k t^k$, we get $m(A)\vec v=\sum_k c_k A^k\vec v=\sum_k c_k\lambda^k\vec v=m(\lambda)\vec v$.

It can also be proved that $\chi(A)=0$ (this is the Cayley-Hamilton theorem). In particular, $m(t)\, |\, \chi(t)$.
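If you'd like to see $\chi(A)=0$ happen on a concrete matrix, here is a small sketch for the $2 \times 2$ case only, where $\chi(t) = t^2 - \operatorname{tr}(A)\,t + \det(A)$. The function name `cayley_hamilton_residual` is just something I made up for illustration; it evaluates $\chi(A)$ entry by entry, and the result should be the zero matrix.

```python
def cayley_hamilton_residual(A):
    """Return chi(A) = A^2 - tr(A)*A + det(A)*I for a 2x2 matrix A.

    Illustrative helper (not from any library); 2x2 matrices only.
    """
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    # A squared, written out by hand for the 2x2 case.
    A2 = [[a * a + b * c, a * b + b * d],
          [c * a + d * c, c * b + d * d]]
    identity = [[1, 0], [0, 1]]
    return [[A2[i][j] - trace * A[i][j] + det * identity[i][j]
             for j in range(2)]
            for i in range(2)]


print(cayley_hamilton_residual([[1, 2], [3, 4]]))  # [[0, 0], [0, 0]]
```

This is a check on examples, not a proof; it only illustrates what the statement says.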

So one example of when (1) occurs is when $A$ has $n$ distinct eigenvalues. If this is so then $m(t)$ has $n$ distinct roots (every eigenvalue is a root, as shown above), so has degree $\ge n$; but it has degree $\le n$ because it divides $\chi(t)$. Thus they must be equal (since they're both monic, have the same roots and the same degree, and one divides the other).
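The distinct-eigenvalue argument can also be checked by brute force. The sketch below is one possible way to compute a minimal polynomial with exact arithmetic, not the standard algorithm: it looks for the first power $A^k$ that is a linear combination of $I, A, \dots, A^{k-1}$. The names `minimal_polynomial`, `solve`, `flatten` and `mat_mul` are my own hypothetical helpers, and I'm assuming a square matrix with integer or rational entries.

```python
from fractions import Fraction


def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def flatten(X):
    return [x for row in X for x in row]


def solve(columns, target):
    """Return c with sum(c[j] * columns[j]) == target, or None if impossible."""
    rows, cols = len(target), len(columns)
    # Augmented matrix [columns | target]; one row per matrix entry.
    M = [[Fraction(columns[j][i]) for j in range(cols)] + [Fraction(target[i])]
         for i in range(rows)]
    pivots, r = [], 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    # A zero row with a nonzero right-hand side means there is no solution.
    if any(M[i][cols] != 0 for i in range(r, rows)):
        return None
    sol = [Fraction(0)] * cols
    for i, c in enumerate(pivots):
        sol[c] = M[i][cols]
    return sol


def minimal_polynomial(A):
    """Coefficients of m(t), lowest degree first, leading coefficient 1."""
    n = len(A)
    powers = [[[int(i == j) for j in range(n)] for i in range(n)]]  # A^0 = I
    while True:
        nxt = mat_mul(powers[-1], A)
        c = solve([flatten(P) for P in powers], flatten(nxt))
        if c is not None:
            # A^k = sum c_j A^j, so m(t) = t^k - sum c_j t^j.
            return [-x for x in c] + [Fraction(1)]
        powers.append(nxt)


# Distinct eigenvalues 2 and 3: m(t) = t^2 - 5t + 6, the same as chi(t).
print(minimal_polynomial([[2, 1], [0, 3]]) == [6, -5, 1])  # True
# Repeated eigenvalue 2, A = 2I: m(t) = t - 2, while chi(t) = (t - 2)^2.
print(minimal_polynomial([[2, 0], [0, 2]]) == [-2, 1])     # True
```

The loop must stop by $k = n$, because $\chi(A)=0$ already gives a dependency among $I, A, \dots, A^n$.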

A more complete characterisation of when (1) occurs (and when (2) occurs) can be gained by considering Jordan Normal Form; but I suspect that you've only just learnt about characteristic and minimal polynomials so I don't want to go into JNF.

Let me know if there's anything else you'd like to know; I no doubt missed some things out.


The minimal polynomial $m(t)$ is the monic factor of least degree of the characteristic polynomial $f(t)$ such that, if $A$ is the matrix, we still have $m(A) = 0$. The only thing the characteristic polynomial measures is the algebraic multiplicity of an eigenvalue, whereas the minimal polynomial measures the size of the $A$-cycles that form the generalized eigenspaces (a.k.a. the size of the Jordan blocks). These facts can be summarized as follows.

  • If $(t - \lambda)^k$ is the exact power of $(t - \lambda)$ dividing $f(t)$, this means that the eigenvalue $\lambda$ has $k$ linearly independent generalized eigenvectors; that is, its generalized eigenspace has dimension $k$.
  • If $(t - \lambda)^p$ is the exact power of $(t - \lambda)$ dividing $m(t)$, this means that the largest $A$-cycle of generalized eigenvectors contains $p$ elements; that is, the largest Jordan block for $\lambda$ is $p \times p$. Notice that this means that $A$ is diagonalizable if and only if $m(t)$ has only simple roots (working over a field, such as $\mathbb{C}$, that contains all the eigenvalues).
  • Thus $f(t) = m(t)$ if and only if each eigenvalue $\lambda$ corresponds to a single Jordan block, a.k.a. each eigenvalue corresponds to a single minimal invariant subspace of generalized eigenvectors.
  • $f(t)$ and $m(t)$ differ if any eigenvalue has more than one Jordan block, a.k.a. if the generalized eigenspace of some eigenvalue splits into more than one $A$-cycle.
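To make the bullet points concrete, here is a small sketch comparing three $3 \times 3$ matrices that all have $f(t) = (t-2)^3$ but different Jordan block structures. It assumes the matrix has a single eigenvalue `lam`, in which case $m(t) = (t - \lambda)^p$ with $p$ the smallest exponent such that $(A - \lambda I)^p = 0$. The helper name `largest_block_size` is hypothetical, chosen just for this example.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def largest_block_size(A, lam):
    """Smallest p with (A - lam*I)^p == 0.

    Assumes lam is the only eigenvalue of A, so that m(t) = (t - lam)^p
    and p is the size of the largest Jordan block.
    """
    n = len(A)
    N = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
         for i in range(n)]
    P = N  # P holds N^p at the start of each iteration
    for p in range(1, n + 1):
        if all(x == 0 for row in P for x in row):
            return p
        P = mat_mul(P, N)
    raise ValueError("lam is not the only eigenvalue of A")


one_block = [[2, 1, 0], [0, 2, 1], [0, 0, 2]]     # a single 3x3 block
two_blocks = [[2, 1, 0], [0, 2, 0], [0, 0, 2]]    # blocks of size 2 and 1
three_blocks = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]  # three 1x1 blocks

print(largest_block_size(one_block, 2))     # 3, so m(t) = (t-2)^3 = f(t)
print(largest_block_size(two_blocks, 2))    # 2, so m(t) = (t-2)^2
print(largest_block_size(three_blocks, 2))  # 1, so m(t) = t-2 (diagonalizable)
```

Only the first matrix, with one Jordan block for its eigenvalue, has $f(t) = m(t)$; the last one has a minimal polynomial with simple roots and is already diagonal.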