LU Decomposition vs. Cholesky Decomposition

Big question with a lot of possible tangents one could go down. I've tried to provide a somewhat brief summary.

In both an $LU$ factorization and a Cholesky $LL^T$ factorization, a square nonsingular matrix $A$ is factored into a product of triangular matrices.${}^*$ In a Cholesky factorization, the lower and upper triangular factors are transposes of each other—that is, an $LU$ factorization is a Cholesky factorization if $L^T=U$.

An example $LU$ factorization:

$$ \begin{pmatrix} 2 & 3 \\ 2 & 7 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 1 & 1\end{pmatrix}\begin{pmatrix} 2 & 3 \\ 0 & 4\end{pmatrix} $$

and a Cholesky factorization (which is also an $LU$ factorization)

$$ \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 1 & 1\end{pmatrix}\begin{pmatrix} 1 & 0 \\ 1 & 1\end{pmatrix}^T. $$

A matrix has a Cholesky factorization if, and only if, it is symmetric positive definite (SPD). If you try to compute a Cholesky factorization for a matrix which is not SPD, it will always fail. However, for an SPD matrix, the Cholesky factorization is convenient in that one only has to store a single triangular matrix $L$ rather than a pair $L$ and $U$ (and it requires roughly half the arithmetic of an $LU$ factorization).
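To illustrate (a minimal sketch using NumPy's `np.linalg.cholesky`; the particular non-SPD matrix is just an example I chose), the SPD example above factors cleanly, while a symmetric indefinite matrix is rejected:

```python
import numpy as np

A_spd = np.array([[1.0, 1.0],
                  [1.0, 2.0]])
L = np.linalg.cholesky(A_spd)
print(L)  # [[1. 0.], [1. 1.]], the factor from the example above

A_indefinite = np.array([[1.0, 2.0],
                         [2.0, 1.0]])  # symmetric, but eigenvalues are 3 and -1
try:
    np.linalg.cholesky(A_indefinite)
except np.linalg.LinAlgError as e:
    print("Cholesky failed:", e)  # the factorization always fails for non-SPD input
```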

Once a matrix $A$ is factored as either an $LU$ or Cholesky factorization, the linear system of equations $Ax=b$ can be solved by first solving $Ly=b$ by forward substitution and then $Ux=y$ (or $L^Tx=y$ for Cholesky) by backward substitution. (The Wikipedia page describes these quite nicely.) In either forward or backward substitution, the key insight is that, if you write out the equations for the unknowns $x_1,\ldots,x_n$ in the right order (forward for $L$ and backward for $U$), each equation introduces only one new unknown beyond the previous equations, which can easily be computed.
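Here is a minimal sketch of both substitutions (the function names and the loop-based implementation are mine, purely for illustration; in practice one would call a library routine such as `scipy.linalg.solve_triangular`):

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L y = b for lower-triangular L, working from the top row down."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        # Row i reads L[i,0]*y[0] + ... + L[i,i]*y[i] = b[i]; the entries
        # y[0..i-1] are already known, so y[i] is the single new unknown.
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def backward_substitution(U, y):
    """Solve U x = y for upper-triangular U, working from the bottom row up."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# Solve A x = b using the LU factorization from the example above.
L = np.array([[1.0, 0.0], [1.0, 1.0]])
U = np.array([[2.0, 3.0], [0.0, 4.0]])
b = np.array([5.0, 9.0])
x = backward_substitution(U, forward_substitution(L, b))
print((L @ U) @ x)  # recovers b
```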

If you need to compute $A^{-1}$, you can compute it from an $LU$ decomposition by the formula $A^{-1} = U^{-1}L^{-1}$, and there exist efficient methods to invert triangular matrices. In practical applications, it is widely accepted wisdom not to compute a matrix inverse unless you truly have to, and there are usually ways around actually forming the inverse. (This, unfortunately, does not apply to linear algebra exams in school.)
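One way to realize the formula (a sketch, not a recommendation; as noted above, forming the inverse explicitly is rarely necessary) is to invert each triangular factor by solving against the identity with `scipy.linalg.solve_triangular`:

```python
import numpy as np
from scipy.linalg import solve_triangular

L = np.array([[1.0, 0.0], [1.0, 1.0]])
U = np.array([[2.0, 3.0], [0.0, 4.0]])

# Invert each triangular factor by solving L X = I and U X = I.
L_inv = solve_triangular(L, np.eye(2), lower=True)
U_inv = solve_triangular(U, np.eye(2), lower=False)

A_inv = U_inv @ L_inv   # A^{-1} = U^{-1} L^{-1}
print(A_inv @ (L @ U))  # numerically the identity matrix
```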

When computing an $LU$ factorization, one may possibly encounter a zero pivot, which necessitates permuting the rows of the matrix to obtain a nonzero pivot. When pivoting is used, one actually computes a decomposition of the form $PA = LU$, where $P$ is a permutation matrix. When performing an $LU$ factorization on a computer using inexact arithmetic (such as floating point arithmetic), it is also important to permute when a pivot entry is merely small, not necessarily exactly zero. In fact, it is common to permute the matrix so that we always pick the largest pivot (in absolute value) in the column, a strategy known as partial pivoting. When performing a Cholesky factorization on an SPD matrix, one will never encounter a zero pivot, and one does not need to pivot to ensure the accuracy of the computation. (One may want to use permutations for other reasons, such as to maintain sparsity.)
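As a small illustration with `scipy.linalg.lu` (note that SciPy's convention returns $P$ such that $A = PLU$, so its $P$ is the transpose of the one in $PA = LU$ above):

```python
import numpy as np
from scipy.linalg import lu

# The (1,1) entry is zero, so unpivoted LU breaks down at the very first pivot.
A = np.array([[0.0, 1.0],
              [1.0, 1.0]])

P, L, U = lu(A)  # SciPy convention: A = P @ L @ U, i.e. P.T @ A = L @ U
print(P)                            # permutation swapping the two rows
print(np.allclose(P.T @ A, L @ U))  # True
```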


${}^*$ One can do $LU$ factorization in a variety of ways for rectangular or rank deficient matrices as well, but the square case is the most common.