Proof for the total derivative of a function
In computing the partial derivative $\frac{\partial f}{\partial x_i}$ of a function $f(x_1,\dots,x_n)$ with respect to a variable $x_i$, one assumes that the other variables are held fixed. This is a consequence of the very definition of the partial derivative. Consequently, if the variables $x_2,\dots,x_n$ are themselves functions of the variable $x_1$, then the partial derivative $\frac{\partial f}{\partial x_1}$ does not capture the overall change of $f$ as $x_1$ varies.
One therefore needs to introduce another measure of this change, namely the total derivative
$$\frac{df}{dx_1}:=\frac{\partial f}{\partial x_1}+\sum_{i=2}^n \frac{\partial f}{\partial x_i}\frac{d x_i}{d x_1}.$$
From its definition (this is the point: I take it as a definition, although you can prove it using the chain rule on $f(x_1,x_2(x_1),\dots,x_n(x_1))$) it is clear that the total derivative takes into account all changes of $f$ as the variable $x_1$ varies.
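As a quick numerical sanity check of the formula above, here is a minimal sketch in Python. The function $f(x_1,x_2)=x_1^2x_2$ and the dependence $x_2=\sin(x_1)$ are illustrative choices of mine, not part of the question, and all derivatives are approximated by central differences.

```python
import math

# Illustrative (assumed) example: f(x1, x2) = x1^2 * x2 with x2 = sin(x1).
def f(x1, x2):
    return x1**2 * x2

def x2_of(x1):
    return math.sin(x1)

def central_diff(g, t, h=1e-6):
    """Central-difference approximation of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

def partial_x1(x1, h=1e-6):
    """Partial derivative w.r.t. x1: x2 is frozen at its current value."""
    x2 = x2_of(x1)
    return central_diff(lambda t: f(t, x2), x1, h)

def total_from_formula(x1, h=1e-6):
    """Total derivative assembled as df/dx1 = f_x1 + f_x2 * dx2/dx1."""
    x2 = x2_of(x1)
    df_dx1 = central_diff(lambda t: f(t, x2), x1, h)
    df_dx2 = central_diff(lambda t: f(x1, t), x2, h)
    dx2_dx1 = central_diff(x2_of, x1, h)
    return df_dx1 + df_dx2 * dx2_dx1

def total_direct(x1, h=1e-6):
    """Derivative of the composite map x1 -> f(x1, x2(x1))."""
    return central_diff(lambda t: f(t, x2_of(t)), x1, h)

if __name__ == "__main__":
    x1 = 1.0
    print("partial derivative:", partial_x1(x1))
    print("total (formula):   ", total_from_formula(x1))
    print("total (direct):    ", total_direct(x1))
```

For this choice the two "total" values should agree with each other and with the closed form $2x_1\sin x_1 + x_1^2\cos x_1$, while the partial derivative alone gives only $2x_1\sin x_1$ and so misses the contribution coming through $x_2$.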
To arrive at a geometrical interpretation of the total derivative, let us multiply both sides of the above expression by the infinitesimal increment "$dx_1$", arriving at
$$df(x):=\frac{df}{dx_1}dx_1=\frac{\partial f}{\partial x_1}dx_1+\sum_{i=2}^n \frac{\partial f}{\partial x_i} d x_i~~(*).$$
We can interpret $df(x)$ in $(*)$ as the total differential of $f$ at $x=(x_1,\dots,x_n)$, which can be written as
$$df(x)=\frac{\partial f}{\partial x_1}dx_1+\sum_{i=2}^n \frac{\partial f}{\partial x_i} d x_i=\sum_{i=1}^n \frac{\partial f}{\partial x_i} d x_i=\langle \nabla f(x), dx \rangle $$
where $dx=(dx_1,\dots,dx_n)$ represents an infinitesimal increment.
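The inner-product form can be illustrated numerically: for a small increment, $\langle \nabla f(x), dx\rangle$ should match the actual change $f(x+dx)-f(x)$ up to terms of second order in $dx$. The sketch below assumes an example function $f(x_1,x_2,x_3)=x_1x_2+\sin x_3$ of my own choosing, with its gradient written out by hand.

```python
import math

# Illustrative (assumed) example: f(x1, x2, x3) = x1*x2 + sin(x3).
def f(x):
    x1, x2, x3 = x
    return x1 * x2 + math.sin(x3)

def grad_f(x):
    """Hand-computed gradient of the example f."""
    x1, x2, x3 = x
    return (x2, x1, math.cos(x3))

def differential(x, dx):
    """Total differential df(x) = <grad f(x), dx>."""
    return sum(g * d for g, d in zip(grad_f(x), dx))

def actual_change(x, dx):
    """Exact change f(x + dx) - f(x)."""
    shifted = tuple(xi + di for xi, di in zip(x, dx))
    return f(shifted) - f(x)

if __name__ == "__main__":
    x = (1.0, 2.0, 0.5)
    for scale in (1e-2, 1e-3, 1e-4):
        dx = (scale, -2 * scale, scale)
        err = abs(actual_change(x, dx) - differential(x, dx))
        print(f"scale={scale:g}  |actual - df| = {err:.3e}")
```

Shrinking the increment by a factor of 10 should shrink the mismatch by roughly a factor of 100, which is what one expects when $df$ is the linear part of the change.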
Try starting with
$$f(x+\Delta x, y+\Delta y) - f(x,y) = \big[f(x+\Delta x, y+\Delta y) - f(x+\Delta x, y)\big] + \big[f(x+\Delta x, y) - f(x, y)\big].$$
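This hint can be carried through as follows. It is only a sketch, and it assumes (this is not stated in the original) that $f$ has continuous partial derivatives near $(x,y)$. Applying the one-variable mean value theorem to each of the two differences on the right-hand side gives points $\eta$ between $y$ and $y+\Delta y$ and $\xi$ between $x$ and $x+\Delta x$ such that
$$f(x+\Delta x, y+\Delta y) - f(x+\Delta x, y) = \frac{\partial f}{\partial y}(x+\Delta x,\eta)\,\Delta y, \qquad f(x+\Delta x, y) - f(x,y) = \frac{\partial f}{\partial x}(\xi,y)\,\Delta x.$$
By continuity of the partial derivatives, each of them differs from its value at $(x,y)$ by a quantity that vanishes as $(\Delta x,\Delta y)\to(0,0)$, so
$$f(x+\Delta x,y+\Delta y)-f(x,y) = \frac{\partial f}{\partial x}(x,y)\,\Delta x + \frac{\partial f}{\partial y}(x,y)\,\Delta y + \varepsilon_1\,\Delta x + \varepsilon_2\,\Delta y,$$
where $\varepsilon_1,\varepsilon_2\to 0$ as $(\Delta x,\Delta y)\to(0,0)$. The linear part of this change is the two-variable case of the total differential, $df=\frac{\partial f}{\partial x}\,dx+\frac{\partial f}{\partial y}\,dy$.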
Reference: https://www.physicsforums.com/threads/proof-of-the-total-differential-of-f-x-y.467858/