Find the square root of a matrix

This is an expansion of Arturo's comment.

The matrix $A=\begin{pmatrix}41 & 12 \\ 12 & 34\end{pmatrix}$ has eigenvalues $50$ and $25$, with eigenvectors $(4,3)$ and $(-3,4)$ respectively, so it eigendecomposes as $$A=\begin{pmatrix}4 & -3 \\ 3 & 4\end{pmatrix} \begin{pmatrix}50 & 0 \\ 0 & 25\end{pmatrix} \begin{pmatrix}4 & -3 \\ 3 & 4\end{pmatrix}^{-1}.$$

This is of the form $A=Q\Lambda Q^{-1}$. A square root is then $B=Q\Lambda^{1/2} Q^{-1}$, since $B^2=Q\Lambda^{1/2}Q^{-1}Q\Lambda^{1/2}Q^{-1}=Q\Lambda Q^{-1}=A$. A square root of a diagonal matrix is obtained by taking the square roots of the diagonal entries, so we have

$$B=\begin{pmatrix}4 & -3 \\ 3 & 4\end{pmatrix} \begin{pmatrix}\sqrt{50} & 0 \\ 0 & \sqrt{25}\end{pmatrix} \begin{pmatrix}4 & -3 \\ 3 & 4\end{pmatrix}^{-1}$$

$$=\frac{1}{5}\begin{pmatrix}9+16\sqrt{2} & -12+12\sqrt{2} \\ -12+12\sqrt{2} & 16+9\sqrt{2}\end{pmatrix}.$$

Here we used $\sqrt{50}=5\sqrt{2},\sqrt{25}=5$, and a quick formula for the inverse of a $2\times 2$ matrix:

$$\begin{pmatrix}a&b\\c&d\end{pmatrix}^{-1}=\frac{1}{ad-bc}\begin{pmatrix}d&-b\\-c&a\end{pmatrix}.$$

Keep in mind that matrix square roots are not unique (even up to sign), but this particular method is guaranteed to produce one example of a real matrix square root whenever $A$ is diagonalizable with all positive eigenvalues.
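As a quick numerical check of the computation above, here is a minimal sketch in plain Python. The helper names (`matmul`, `inverse_2x2`, `sqrt_via_eigendecomposition`) are made up for this illustration, and the eigenvectors and eigenvalues are typed in by hand rather than computed.

```python
import math

def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(M):
    """Invert a 2x2 matrix with the ad - bc formula quoted above."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def sqrt_via_eigendecomposition(Q, eigenvalues):
    """Return Q * diag(sqrt(eigenvalues)) * Q^{-1} for positive eigenvalues."""
    root_lambda = [[math.sqrt(eigenvalues[0]), 0.0],
                   [0.0, math.sqrt(eigenvalues[1])]]
    return matmul(matmul(Q, root_lambda), inverse_2x2(Q))

# Columns of Q are the eigenvectors (4, 3) and (-3, 4) from the text.
Q = [[4.0, -3.0], [3.0, 4.0]]
B = sqrt_via_eigendecomposition(Q, [50.0, 25.0])

# Squaring B should reproduce [[41, 12], [12, 34]] up to rounding.
B_squared = matmul(B, B)
```

Comparing `B` entry by entry with the closed form $\frac15(9+16\sqrt2)$, etc., is a cheap way to catch arithmetic slips.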


Finding an upper triangular $U$ such that $A=U^TU$ is even more straightforward:

$$A=\begin{pmatrix} a&0 \\ b&c \end{pmatrix} \cdot \begin{pmatrix} a&b \\ 0&c \end{pmatrix} $$

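Writing $U=\begin{pmatrix} a&b \\ 0&c \end{pmatrix}$ and comparing entries (taking positive roots) gives $a^2=41$, hence $a=\sqrt{41}$; $ab=12$, hence $b=\frac{12}{41}\sqrt{41}$; and $b^2+c^2=34$, hence $c=25\sqrt{\frac{2}{41}}$.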

In other words,

$$U=\sqrt{41}\begin{pmatrix}1&\frac{12}{41}\\0&\frac{25}{41}\sqrt{2}\end{pmatrix}. $$
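The three entry equations translate directly into code. The following is only a sketch for the symmetric positive definite $2\times2$ case; the function name `cholesky_upper_2x2` is hypothetical, not a library routine.

```python
import math

def cholesky_upper_2x2(A):
    """Upper triangular U with U^T U = A, for symmetric positive definite 2x2 A."""
    a = math.sqrt(A[0][0])           # from a^2 = A[0][0]
    b = A[0][1] / a                  # from a*b = A[0][1]
    c = math.sqrt(A[1][1] - b * b)   # from b^2 + c^2 = A[1][1]
    return [[a, b], [0.0, c]]

A = [[41.0, 12.0], [12.0, 34.0]]
U = cholesky_upper_2x2(A)

# Rebuild U^T U entry by entry; it should match A up to rounding.
UtU = [[sum(U[k][i] * U[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]
```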


For the first part of your question, here is a solution that only works for 2-by-2 matrices, but it has the merit of not requiring any eigenvalues.

Recall that in the two-dimensional case, there is a magic equation that is useful in many situations. It is $X^2-({\rm tr}X)X+(\det X)I=0$, which arises from the characteristic polynomial of a $2\times2$ matrix $X$. Now, if $X^2=A$, then $\det X=\pm\sqrt{\det A}$; take the positive value and call it $r$. Hence $$ (\ast):\quad ({\rm tr}X)X=X^2+rI=A+rI $$ and, taking traces, $({\rm tr}X)^2 = {\rm tr}\left(({\rm tr}X)X\right) = {\rm tr}(A+rI) = {\rm tr}A + 2r$. Choosing the positive square root for ${\rm tr}X$, we obtain from $(\ast)$ $$ X = \frac{1}{\sqrt{{\rm tr}A + 2r}}(A+rI)\quad {\rm where}\quad r=\sqrt{\det A}. $$ This method works for all 2-by-2 matrices $A$ when $\det A\ge0$ and ${\rm tr}A + 2\sqrt{\det A}>0$. In particular, it works for positive definite $A$.
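A minimal sketch of this formula in plain Python, with the two conditions stated above checked explicitly. The function name `sqrt_2x2_trace_det` is made up for the illustration.

```python
import math

def sqrt_2x2_trace_det(A):
    """Return X = (A + r I) / sqrt(tr A + 2 r) with r = sqrt(det A)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det < 0:
        raise ValueError("the formula needs det A >= 0")
    r = math.sqrt(det)
    s = a + d + 2.0 * r
    if s <= 0:
        raise ValueError("the formula needs tr A + 2 sqrt(det A) > 0")
    t = math.sqrt(s)  # this is tr X, taken positive
    return [[(a + r) / t, b / t], [c / t, (d + r) / t]]

X = sqrt_2x2_trace_det([[41.0, 12.0], [12.0, 34.0]])

# The formula does not need symmetry or diagonalizability:
Y = sqrt_2x2_trace_det([[1.0, 1.0], [0.0, 1.0]])
```

The second call uses a Jordan block, for which an eigendecomposition is not available, yet the formula still returns a valid root.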

For the second part of your question, as the others have pointed out, the decomposition you ask for is a Cholesky decomposition.


Let $\lambda$ and $\mu$ be the eigenvalues of your $2$ by $2$ real matrix $A$. (We may have $\lambda=\mu$.) Assume that $\lambda$ and $\mu$ are positive.

If $\lambda\not=\mu$, write the equation of the secant line to the curve $y=\sqrt x$ through the points $(\lambda,\sqrt\lambda)$ and $(\mu,\sqrt\mu)$: $$y=\sqrt\lambda\ \ \frac{x-\mu}{\lambda-\mu}+\sqrt\mu\ \ \frac{x-\lambda}{\mu-\lambda}\quad.$$ The matrix you want is $$\sqrt\lambda\ \ \frac{A-\mu I}{\lambda-\mu}+\sqrt\mu\ \ \frac{A-\lambda I}{\mu-\lambda}\quad,$$ where $I$ is the identity matrix.

If $\lambda=\mu$, write the equation of the tangent line to the curve $y=\sqrt x$ through the point $(\lambda,\sqrt\lambda)$: $$y=\sqrt\lambda+\frac{x-\lambda}{2\sqrt\lambda}\quad.$$ The matrix you want is $$\sqrt\lambda\ I+\frac{A-\lambda I}{2\sqrt\lambda}\quad.$$

Do you see why?

Do you see how to generalize this to $n$ by $n$ matrices?
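For experimenting with the secant and tangent formulas, here is a small sketch in plain Python. It assumes the caller already knows the (positive) eigenvalues; the names `sqrt_by_interpolation` and `matmul` are hypothetical helpers for this illustration.

```python
import math

def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sqrt_by_interpolation(A, lam, mu):
    """Evaluate the secant (lam != mu) or tangent (lam == mu) line of sqrt at A."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    if math.isclose(lam, mu):
        # Tangent line at lam: sqrt(lam) + (x - lam) / (2 sqrt(lam)).
        slope = 1.0 / (2.0 * math.sqrt(lam))
    else:
        # Secant line through (lam, sqrt(lam)) and (mu, sqrt(mu)).
        slope = (math.sqrt(lam) - math.sqrt(mu)) / (lam - mu)
    intercept = math.sqrt(lam) - slope * lam  # both lines pass through (lam, sqrt(lam))
    return [[slope * A[i][j] + intercept * identity[i][j] for j in range(2)]
            for i in range(2)]

# Distinct eigenvalues 50 and 25:
S = sqrt_by_interpolation([[41.0, 12.0], [12.0, 34.0]], 50.0, 25.0)
S_squared = matmul(S, S)

# Repeated eigenvalue 1 (a Jordan block):
T = sqrt_by_interpolation([[1.0, 1.0], [0.0, 1.0]], 1.0, 1.0)
T_squared = matmul(T, T)
```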

EDIT 4. This is just to explain why this secant/tangent stuff comes into the picture. Assume, for simplicity, that the eigenvalues $\lambda$ and $\mu$ of your two by two real matrix $A$ are real and distinct. Let $f\in\mathbb R[X]$ be a polynomial, and $s$ the unique polynomial of degree $\le1$ which agrees with $f$ at $\lambda$ and $\mu$. [Graphically, this is a secant line.] Then the characteristic polynomial $$\chi=(X-\lambda)(X-\mu)$$ will divide $f-s$, since $f-s$ vanishes at both roots of $\chi$. As $\chi(A)=0$ by the Cayley-Hamilton Theorem, we have $f(A)=s(A)$. But the expression $s(A)$ makes sense whenever $f$ is any (real-valued) function defined at $\lambda$ and $\mu$, so we can take it as the definition of $f(A)$. Moreover, the map $f\mapsto f(A)$ is compatible with addition and multiplication.
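To see $f(A)=s(A)$ in action for a genuine polynomial, one can take $f(x)=x^3$ and compare $A^3$ with the secant line of $f$ evaluated at $A$. This is just an illustrative sketch in plain Python; the helper names are invented here, and the example matrix is the one with eigenvalues $50$ and $25$.

```python
def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def secant_at_matrix(f, A, lam, mu):
    """Evaluate s(A), where s is the line agreeing with f at lam and mu (lam != mu)."""
    slope = (f(lam) - f(mu)) / (lam - mu)
    intercept = f(lam) - slope * lam
    return [[slope * A[i][j] + (intercept if i == j else 0.0) for j in range(2)]
            for i in range(2)]

def cube(x):
    return x ** 3

A = [[41.0, 12.0], [12.0, 34.0]]
A_cubed = matmul(matmul(A, A), A)               # f(A) computed directly
s_of_A = secant_at_matrix(cube, A, 50.0, 25.0)  # s(A) from the secant line
```

Here the secant line is $s(x)=4375x-93750$, and both sides have top-left entry $85625$.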

EDIT 1. As noticed by @Did and @user1551, there is a cute formula for the "generalized secant line" to the curve $y=\sqrt x$, by which I mean: the secant line if the points are distinct, the tangent line if they coincide. Supposing $\lambda\not=\mu$, the equation of the secant line is $$y=\frac{\sqrt\lambda-\sqrt\mu}{\lambda-\mu}\ \ x+ \frac{\lambda\sqrt\mu-\mu\sqrt\lambda}{\lambda-\mu}= \frac{x+\sqrt{\lambda\mu}}{\sqrt\lambda+\sqrt\mu}\quad,$$ and the miracle is that the last expression makes sense even if $\lambda=\mu$.
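As a worked check, take the matrix $A=\begin{pmatrix}41&12\\12&34\end{pmatrix}$ with $\lambda=50$ and $\mu=25$, so that $\sqrt{\lambda\mu}=25\sqrt2$ and $\sqrt\lambda+\sqrt\mu=5(1+\sqrt2)$. Substituting $A$ for $x$ and multiplying through by $\sqrt2-1$ gives

$$\frac{A+25\sqrt2\,I}{5(1+\sqrt2)}=\frac{\sqrt2-1}{5}\begin{pmatrix}41+25\sqrt2 & 12 \\ 12 & 34+25\sqrt2\end{pmatrix}=\frac{1}{5}\begin{pmatrix}9+16\sqrt{2} & -12+12\sqrt{2} \\ -12+12\sqrt{2} & 16+9\sqrt{2}\end{pmatrix},$$

which agrees with the square root obtained from the eigendecomposition.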

EDIT 2. Note that there are other solutions when $\lambda\not=\mu$. Putting $$E:=\frac{A-\lambda I}{\mu-\lambda}\quad,\quad F:=\frac{A-\mu I}{\lambda-\mu}\quad,$$ we get $$E^2=E,\ F^2=F,\ EF=FE=0,\ I=E+F,\ A=\mu E+\lambda F,$$ and thus $$(\pm\sqrt\mu\ E\pm\sqrt\lambda\ F)^2=A$$ for the four choices of signs. [The plus plus choice corresponds to the previous formula.]
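The four sign choices are easy to enumerate numerically. Below is a minimal sketch in plain Python for the example with $\lambda=50$, $\mu=25$; all variable and helper names are made up for the illustration.

```python
import math
from itertools import product

def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[41.0, 12.0], [12.0, 34.0]]
lam, mu = 50.0, 25.0
I2 = [[1.0, 0.0], [0.0, 1.0]]

# E = (A - lam I) / (mu - lam) and F = (A - mu I) / (lam - mu), as in the text.
E = [[(A[i][j] - lam * I2[i][j]) / (mu - lam) for j in range(2)] for i in range(2)]
F = [[(A[i][j] - mu * I2[i][j]) / (lam - mu) for j in range(2)] for i in range(2)]

# One square root per choice of signs; the first (+, +) is the principal one.
roots = []
for sign_mu, sign_lam in product((1.0, -1.0), repeat=2):
    X = [[sign_mu * math.sqrt(mu) * E[i][j] + sign_lam * math.sqrt(lam) * F[i][j]
          for j in range(2)] for i in range(2)]
    roots.append(X)

squares = [matmul(X, X) for X in roots]  # each should equal A up to rounding
```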

EDIT 3. Here is a generalization.

Let $T$ be an $n$ by $n$ complex matrix, and $$p(X)=(X-\lambda_1)^{m(1)}\cdots(X-\lambda_k)^{m(k)}$$ its minimal polynomial (the $\lambda_i$ being distinct and the $m(i)$ positive). Let $A$ be the algebra of those functions $f(z)$ which are holomorphic in a neighborhood of the spectrum $\{ \lambda_1,\dots,\lambda_k \}$ of $T$.

There is a unique $\mathbb C[X]$-algebra morphism from $A$ to $\mathbb C[T]=\mathbb C[X]/(p(X))$. Denote this morphism by $f(z)\mapsto f(T)$. If $f(z)$ is in $A$, then the unique representative of $f(T)$ in $\mathbb C[X]$ of degree less than $\deg p(X)$ is $$\sum_{i=1}^k\ \ \underset{X=\lambda_i}\heartsuit\left( \Big(\ \underset{z=\lambda_i}\heartsuit f(z) \Big)\ \ \frac{(X-\lambda_i)^{m(i)}}{p(X)}\ \right)\ \frac{p(X)}{(X-\lambda_i)^{m(i)}}$$

with $$\underset{u=\lambda_i}\heartsuit\varphi(u):=\sum_{j=0}^{m(i)-1}\frac{\varphi^{(j)}(\lambda_i)}{j!}\ (X-\lambda_i)^j.$$

Moreover, the $\lambda_i$-generalized eigenspace of $T$ is contained in the $f(\lambda_i)$-generalized eigenspace of $f(T)$.

All this follows from the Chinese Remainder Theorem, which says $$\frac{\mathbb C[X]}{(p(X))}=\prod_{i=1}^k\ \ \frac{\mathbb C[X]}{(X-\lambda_i)^{m(i)}}\quad,$$ and from the Taylor Formula.
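As a sanity check (reading $\heartsuit$ as "Taylor polynomial truncated below order $m(i)$", which is how the definition above unpacks), the general formula reduces to the two $2\times2$ cases. For $p(X)=(X-\lambda)(X-\mu)$ with $\lambda\not=\mu$, each $\heartsuit$ is plain evaluation, and the representative is

$$f(\lambda)\ \frac{X-\mu}{\lambda-\mu}+f(\mu)\ \frac{X-\lambda}{\mu-\lambda}\quad,$$

the secant line. For $p(X)=(X-\lambda)^2$ there is a single term, and the representative is

$$f(\lambda)+f'(\lambda)\,(X-\lambda)\quad,$$

the tangent line. With $f(z)=\sqrt z$ these are the two formulas used for the square root.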

[There is an Edit 4 above.]