Counterexample to the chain rule

The chain rule does apply here, and since other answers and comments seem to imply that it is ill-defined, I think this answer may be useful.

The problem with your application of the chain rule is that the derivative of $X \mapsto X^2$ at a matrix $X_0$ is not $H \mapsto 2X_0H$, but instead $H \mapsto X_0H+HX_0$.*

This yields the derivative of your $g(t)=f(t)^2$ (identifying the linear map $g'(t):\mathbb{R} \to M_2(\mathbb{C})$ with its value at $1$, as is usually done) as

$$f(t)\cdot f'(t)+f'(t) \cdot f(t)$$ by the chain rule, which, as you can verify, makes the computation check out.
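(Not part of the original argument, but if you want a symbolic sanity check: the sketch below uses SymPy to differentiate a generic $2\times 2$ matrix-valued $f(t)$ entrywise and confirms that $\frac{d}{dt}f(t)^2 = f(t)f'(t)+f'(t)f(t)$.)

```python
import sympy as sp

t = sp.symbols('t')
# Entries of f(t) are arbitrary (differentiable) scalar functions of t
a, b, c, d = (sp.Function(name)(t) for name in 'abcd')
f = sp.Matrix([[a, b], [c, d]])

lhs = (f * f).diff(t)                 # d/dt [f(t)^2], differentiated entrywise
rhs = f * f.diff(t) + f.diff(t) * f   # f(t) f'(t) + f'(t) f(t)

print(sp.simplify(lhs - rhs))         # prints the zero matrix
```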

*This is a simple computation: $$(X_0+H)^2=X_0^2+X_0H+HX_0+H^2=X_0^2+(X_0H+HX_0)+o(H),$$ and the term in parentheses is linear in $H$.
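If a numerical check is more convincing, here is a small sketch (my own addition, using NumPy and a random $3\times 3$ example) comparing a finite-difference quotient against both candidate derivatives:

```python
import numpy as np

rng = np.random.default_rng(0)
X0 = rng.standard_normal((3, 3))
H = rng.standard_normal((3, 3))

eps = 1e-6
# Directional derivative of X -> X^2 at X0 along H, approximated numerically
finite_diff = ((X0 + eps * H) @ (X0 + eps * H) - X0 @ X0) / eps

print(np.allclose(finite_diff, X0 @ H + H @ X0, atol=1e-4))  # True
print(np.allclose(finite_diff, 2 * X0 @ H, atol=1e-4))       # False (generically)
```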


Trying to use a chain rule leads us into a morass of missing definitions (what does it mean to differentiate a function whose input is a matrix, for example?), but we can see something of what happens when we use the product rule:

$$\frac{d}{dt} (f(t)g(t)) = f'(t)g(t) + f(t)g'(t) $$ Here we need to remember that we're talking about matrices, so the order of factors is important. For example, we cannot replace the $f(t)g'(t)$ term with $g'(t)f(t)$ and expect its value to stay the same.
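To make this order-sensitivity concrete, here is a small SymPy sketch (my own addition, with two arbitrarily chosen non-commuting matrix functions) showing that $f'g + fg'$ matches $\frac{d}{dt}(fg)$ while reversing the factors in the second term does not:

```python
import sympy as sp

t = sp.symbols('t')
# Two arbitrarily chosen, non-commuting matrix-valued functions of t
f = sp.Matrix([[t, 1], [0, t**2]])
g = sp.Matrix([[sp.cos(t), 0], [sp.sin(t), 1]])

correct = f.diff(t) * g + f * g.diff(t)   # f'(t) g(t) + f(t) g'(t), factors kept in order
swapped = f.diff(t) * g + g.diff(t) * f   # second term with its factors reversed

print(sp.simplify((f * g).diff(t) - correct))  # zero matrix
print(sp.simplify((f * g).diff(t) - swapped))  # not the zero matrix
```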

For $f(t)^2$ this gives us $$ \frac{d}{dt} f(t)^2 = f'(t)f(t) + f(t)f'(t) $$ In contrast to the usual commutative case, the two terms cannot be combined into one.

For your particular example we have $$ f(t) = \begin{pmatrix} 0 & e^{it} \\ e^{-it} & 0 \end{pmatrix} \qquad\qquad f'(t) = \begin{pmatrix} 0 & ie^{it} \\ -ie^{-it} & 0 \end{pmatrix} $$ and we get $$ f'(t)f(t) = \begin{pmatrix} i&0\\ 0 & -i \end{pmatrix} \qquad\qquad f(t)f'(t) = \begin{pmatrix} -i&0\\ 0 & i \end{pmatrix} $$ whose sum is clearly the zero matrix, consistent with the fact that $f(t)^2$ is the constant identity matrix.
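For completeness, a short SymPy sketch (my own addition) reproducing this computation, including the observation that $f(t)^2$ is constant:

```python
import sympy as sp

t = sp.symbols('t', real=True)
f = sp.Matrix([[0, sp.exp(sp.I * t)], [sp.exp(-sp.I * t), 0]])
fp = f.diff(t)

print(sp.simplify(f * f))             # identity matrix, so (f^2)' must vanish
print(sp.simplify(fp * f))            # Matrix([[I, 0], [0, -I]])
print(sp.simplify(f * fp))            # Matrix([[-I, 0], [0, I]])
print(sp.simplify(fp * f + f * fp))   # zero matrix
```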