Why is the following NOT a proof of The Chain Rule?

The main problem is that you could have $\Delta u = 0$ for some nonzero values of $\Delta x$ arbitrarily close to $0$, which makes $\dfrac{\Delta y}{\Delta u}$ undefined there.


The proof given in the question is almost rigorous and correct, but it is written in reverse (as pointed out by Wazul). Moreover, this is a very standard proof of the chain rule. Contrary to what many students think, this proof is not based on infinitesimals. But we do need to add some details.


Let $u = g(x)$ be differentiable and let $y = f(u)$ also be differentiable. Then $y = f(g(x)) = (f \circ g)(x)$ and the Chain Rule says that $$\frac{dy}{dx} = \frac{dy}{du}\cdot\frac{du}{dx}$$ or $$(f\circ g)'(x) = f'(u)g'(x) = f'(g(x))g'(x)$$ The proof in the question needs to be supplemented with proper definitions of $\Delta u$ and $\Delta y$.

We have $\Delta u = g(x + \Delta x) - g(x)$ and
$$\frac{du}{dx} = g'(x) = \lim_{\Delta x \to 0}\frac{g(x + \Delta x) - g(x)}{\Delta x} = \lim_{\Delta x \to 0}\frac{\Delta u}{\Delta x}$$
Similarly, with $\Delta y = f(u + \Delta u) - f(u)$ we have
$$\frac{dy}{du} = f'(u) = \lim_{\Delta u \to 0}\frac{f(u + \Delta u) - f(u)}{\Delta u} = \lim_{\Delta u \to 0}\frac{\Delta y}{\Delta u}$$
We then have
\begin{align}
\frac{dy}{dx} &= (f\circ g)'(x) = \lim_{\Delta x \to 0}\frac{(f \circ g)(x + \Delta x) - (f \circ g)(x)}{\Delta x}\notag\\
&= \lim_{\Delta x \to 0}\frac{f(g(x + \Delta x)) - f(g(x))}{\Delta x}\notag\\
&= \lim_{\Delta x \to 0}\frac{f(g(x) + \Delta u) - f(g(x))}{\Delta x}\notag\\
&= \lim_{\Delta x \to 0}\frac{f(u + \Delta u) - f(u)}{\Delta x}\notag\\
&= \lim_{\Delta x \to 0}\frac{f(u + \Delta u) - f(u)}{\Delta u}\cdot\frac{\Delta u}{\Delta x}\text{ (assume }\Delta u \neq 0)\notag\\
&= \lim_{\Delta x \to 0}\frac{f(u + \Delta u) - f(u)}{\Delta u}\cdot\frac{g(x + \Delta x) - g(x)}{\Delta x}\notag\\
&= \lim_{\Delta u \to 0}\frac{f(u + \Delta u) - f(u)}{\Delta u}\cdot\lim_{\Delta x \to 0}\frac{g(x + \Delta x) - g(x)}{\Delta x}\notag\\
&= f'(u)g'(x)\notag\\
&= \frac{dy}{du}\cdot\frac{du}{dx}
\end{align}
We have assumed above that $\Delta u \neq 0$ for all sufficiently small nonzero $\Delta x$. Also, by continuity of $u = g(x)$ (note that differentiability implies continuity), we have $\Delta u \to 0$ as $\Delta x \to 0$, which is what allows the limit in the first factor to be taken as $\Delta u \to 0$.


The above argument fails when $\Delta u = g(x + \Delta x) - g(x)$ vanishes for infinitely many values of $\Delta x$ as $\Delta x \to 0$. In this case we have $du/dx = g'(x) = 0$. Why? Because if $g'(x) \neq 0$ then the ratio $\Delta u / \Delta x \neq 0$ for all sufficiently small nonzero values of $\Delta x$, and hence $\Delta u \neq 0$ for all such values of $\Delta x$.
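
A standard example of this situation (chosen here for illustration; it does not appear in the question) is the function $$g(x) = \begin{cases} x^2\sin(1/x) & \text{if } x \neq 0, \\[6pt] 0 & \text{if } x = 0, \end{cases}$$ considered at the point $x = 0$. Here $$\Delta u = g(\Delta x) - g(0) = (\Delta x)^2\sin\left(\frac{1}{\Delta x}\right),$$ which vanishes at $\Delta x = \dfrac{1}{n\pi}$ for every nonzero integer $n$, so $\Delta u = 0$ for infinitely many $\Delta x$ as $\Delta x \to 0$. In agreement with the claim above, $$g'(0) = \lim_{\Delta x \to 0}\frac{\Delta u}{\Delta x} = \lim_{\Delta x \to 0}\Delta x\sin\left(\frac{1}{\Delta x}\right) = 0.$$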

Hence if $\Delta u = 0$ for infinitely many small values of $\Delta x$ then $du/dx = g'(x) = 0$. We show that in this case $dy/dx = 0$. Clearly, for those values of $\Delta x$ for which $\Delta u = 0$ we also have $\Delta y = f(u + \Delta u) - f(u) = 0$, so that the ratio $\Delta y/\Delta x = 0$. For values of $\Delta x$ where $\Delta u \neq 0$ we know that $\Delta y/\Delta u$ is bounded once $\Delta u$ is small (because the derivative $dy/du$ exists) and the ratio $\Delta u/\Delta x$ can be made arbitrarily small (because its limit is $du/dx = 0$). Hence the overall product $$\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u}\cdot\frac{\Delta u}{\Delta x}$$ can be made arbitrarily small by choosing $\Delta x$ sufficiently small. Combining the two cases, $\Delta y / \Delta x$ is arbitrarily small for all sufficiently small nonzero values of $\Delta x$. It follows that $$\frac{dy}{dx} = \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta x} = 0$$ and therefore the chain rule holds true in this case also.
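
One way to make the boundedness step explicit is sketched here; the constants $M$ and $\delta$ are names introduced only for this illustration. Since $\Delta y/\Delta u \to f'(u)$ as $\Delta u \to 0$, there is a $\delta > 0$ such that $$\left|\frac{\Delta y}{\Delta u}\right| \leq M := |f'(u)| + 1 \quad\text{whenever } 0 < |\Delta u| < \delta.$$ By continuity of $g$ we have $|\Delta u| < \delta$ for all sufficiently small $\Delta x$, and for such $\Delta x \neq 0$ $$\left|\frac{\Delta y}{\Delta x}\right| \leq M\left|\frac{\Delta u}{\Delta x}\right|,$$ where the case $\Delta u = 0$ is covered because both sides are then $0$. The right-hand side tends to $M\,|g'(x)| = 0$ as $\Delta x \to 0$, which gives $dy/dx = 0$.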

Note that most common calculus textbooks omit the discussion of the case when $\Delta u = 0$. Also, the argument is much easier to type and to understand if $\Delta x$ is replaced by $h$ and $\Delta u$ is replaced by $k$. However, I have tried to stick to the notation used by the OP.


If $\Delta u=0$ then you have a $0$ in a denominator.

That's not a problem if it happens only when $|\Delta x|>0.000000000001$ since the limit depends only on what happens when $|\Delta x|$ is less than that. And similarly with any other positive number in place of $0.000000000001$.

But now suppose it happens when $|\Delta x|=0.000000000001$ and again when $|\Delta x| =0.000000000001^2$ and again when $|\Delta x|=0.000000000001^3$ and so on, ad infinitum. Then it's a difficulty to be addressed. And what if $\Delta u=0$ for all $\Delta x$ between $\pm 0.001$? Then your proof clearly won't work.

Hence one writes $$ \frac{\Delta y}{\Delta x} = \left.\begin{cases} \Delta y/\Delta u & \text{if } \Delta u\ne0, \\[6pt] dy/du & \text{if }\Delta u = 0, \end{cases} \right\} \cdot \frac{\Delta u}{\Delta x} $$ and one goes on from there. The $\displaystyle \left\{ \begin{array}{c} \text{factor in} \\ \text{braces} \end{array} \right\}$ approaches $dy/du$ as $\Delta x\to 0$ and the second factor, $\dfrac{\Delta u}{\Delta x}$, approaches $du/dx$.
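
To sketch why the factor in braces has the stated limit, it may help to give it a name; the symbol $Q$ below is introduced only for this illustration. With $u$ fixed, define $$Q(k) = \begin{cases} \dfrac{f(u + k) - f(u)}{k} & \text{if } k \neq 0, \\[6pt] f'(u) & \text{if } k = 0. \end{cases}$$ By the definition of $f'(u)$ we have $Q(k) \to f'(u) = Q(0)$ as $k \to 0$, so $Q$ is continuous at $k = 0$, and $\Delta y = Q(\Delta u)\,\Delta u$ holds whether or not $\Delta u = 0$. Since $\Delta u \to 0$ as $\Delta x \to 0$ (differentiability of $g$ implies continuity), $$\frac{\Delta y}{\Delta x} = Q(\Delta u)\cdot\frac{\Delta u}{\Delta x} \longrightarrow Q(0)\cdot g'(x) = f'(u)\,g'(x) \quad\text{as } \Delta x \to 0,$$ with no division by $\Delta u$ anywhere in the argument.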