How does the total derivative account for dependencies behind variables (intuitively)?
Consider the function $F(x,y)$. Geometrically, it describes a surface: the inputs are points in the $x$-$y$ plane, and the function 'raises' each point to a height $z = F(x,y)$.
Making $y$ a function of $x$ (i.e. $y(x)$) restricts us to moving along a curve in the $x$-$y$ plane. The function sends this curve to a curve on the surface. So the expression $\frac{dF}{dx}$ denotes the rate of change of height as we change $x$, i.e. as we move the point along the curve in the input plane.
Geometrical idea behind parameterizing:
Graph of the function: $$f(x,y)=x^2 y e^x$$
Plugging in $y(x)=x^2$ gives:
$$ f(x,y(x)) = x^2 e^x y(x) = x^4 e^x$$
In the plot, the curve $y=x^2$ has been raised up along the $z$ axis, giving a vertical surface. The intersection of the red surface with this raised blue curve is the part of the surface you get when you plug the equation of the curve into the surface's equation.
To first order, we can write $\Delta F$ as:
$$ \Delta F = \frac{ \partial F}{\partial x} \Delta x + \frac{ \partial F}{\partial y} \Delta y$$
The idea here is that we approximate the surface locally by its tangent plane. In this tangent plane, the change in height from moving along the $x$ direction is $\frac{\partial F}{\partial x} \Delta x$, and the change from moving along the $y$ direction is $\frac{\partial F}{\partial y} \Delta y$. Because a plane is linear, these two contributions simply add to give the total change in height.
If we wanted to approximate the surface more accurately with a paraboloid (i.e. include second-order variations), there would be a 'cross-effect': changing one parameter influences the effect of nudging the other.
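To make the tangent-plane picture and the 'cross-effect' concrete, here is a small numerical sketch using the example $f(x,y)=x^2 y e^x$ from above. The base point $(1,1)$, the step sizes, and the helper names (`residual`, `cross_effect`) are my own choices for illustration. The partial derivatives are worked out by hand.

```python
import math

def f(x, y):
    return x**2 * y * math.exp(x)

# Hand-computed partial derivatives of f(x, y) = x^2 * y * e^x.
def f_x(x, y):
    return (2 * x + x**2) * y * math.exp(x)

def f_y(x, y):
    return x**2 * math.exp(x)

def f_xy(x, y):
    return (2 * x + x**2) * math.exp(x)

def residual(x0, y0, dx, dy):
    """Actual change in f minus the tangent-plane (first-order) prediction."""
    actual = f(x0 + dx, y0 + dy) - f(x0, y0)
    linear = f_x(x0, y0) * dx + f_y(x0, y0) * dy
    return actual - linear

def cross_effect(x0, y0, dx, dy):
    """Part of the residual that appears only when both inputs are nudged."""
    return (residual(x0, y0, dx, dy)
            - residual(x0, y0, dx, 0.0)
            - residual(x0, y0, 0.0, dy))

x0, y0 = 1.0, 1.0
for h in (1e-1, 1e-2, 1e-3):
    # Columns: step, tangent-plane error, measured cross term, f_xy * dx * dy
    print(h, residual(x0, y0, h, h), cross_effect(x0, y0, h, h), f_xy(x0, y0) * h * h)
```

The tangent-plane error should shrink roughly a hundredfold each time the step shrinks tenfold, which is why it drops out in the limit, and the measured cross term should track $\frac{\partial^2 f}{\partial x \partial y}\Delta x \Delta y$.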
Now I will make the dependence on $x$ explicit in each term of the equation above:
$$ \Delta F(x) = \frac{ \partial F}{\partial x} \Delta x + \frac{ \partial F}{\partial y} \Delta y(x)$$
Note that I have written $\Delta y(x)$: when you change $x$, you also change the $y$ coordinate of the point on the curve in the input plane.
Now divide through by $\Delta x$:
$$ \frac{ \Delta F(x) }{\Delta x} = \frac{ \partial F}{\partial x} + \frac{ \partial F}{\partial y} \frac{\Delta y(x)}{\Delta x}$$
Taking the limit as $ \Delta x \to 0$
$$ F'(x) = \frac{ \partial F}{\partial x} + \frac{ \partial F}{\partial y} y'$$
This is exactly what the total derivative is telling us.
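As a sanity check, the formula can be compared against a finite-difference derivative for the example $f(x,y)=x^2 y e^x$ with $y(x)=x^2$. This is only a sketch: the evaluation point $x_0 = 1$ and the step size are arbitrary choices of mine, and the partial derivatives are computed by hand.

```python
import math

def f(x, y):
    return x**2 * y * math.exp(x)

# Hand-computed partial derivatives of f.
def f_x(x, y):
    return (2 * x + x**2) * y * math.exp(x)

def f_y(x, y):
    return x**2 * math.exp(x)

# The curve in the input plane and its slope.
def y(x):
    return x**2

def y_prime(x):
    return 2 * x

def total_derivative(x):
    """dF/dx = f_x + f_y * y', with the partials evaluated on the curve."""
    return f_x(x, y(x)) + f_y(x, y(x)) * y_prime(x)

def numeric_derivative(x, h=1e-5):
    """Central difference of F(x) = f(x, y(x)), which moves along the curve."""
    return (f(x + h, y(x + h)) - f(x - h, y(x - h))) / (2 * h)

x0 = 1.0
chain_rule_value = total_derivative(x0)
finite_diff_value = numeric_derivative(x0)
print(chain_rule_value, finite_diff_value)
```

Both numbers should agree closely with the derivative of $x^4 e^x$, which is $(4x^3 + x^4)e^x$.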
For functions of more variables, we would just have to account for the changes in $F$ due to the other variables, generalizing the first equation I wrote.
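A rough sketch of that generalization: each dependent variable contributes its own "partial derivative times its rate of change with $x$" term. The function `total_derivative` and the three-variable example $f(x,y,z) = xy + z^2$ with $y=\sin x$, $z = x^2$ are hypothetical, chosen only for illustration. All derivatives here are estimated by central differences.

```python
import math

def partial(f, args, i, h=1e-6):
    """Central-difference estimate of the partial derivative of f in slot i."""
    up, down = list(args), list(args)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

def derivative(g, x, h=1e-6):
    """Central-difference estimate of g'(x) for a one-variable function."""
    return (g(x + h) - g(x - h)) / (2 * h)

def total_derivative(f, deps, x):
    """d/dx of f(x, d1(x), d2(x), ...) assembled term by term."""
    args = [x] + [d(x) for d in deps]
    total = partial(f, args, 0)  # direct dependence on x
    for i, d in enumerate(deps, start=1):
        total += partial(f, args, i) * derivative(d, x)  # indirect dependence
    return total

# Illustrative example (an assumption, not from the discussion above).
def f(x, y, z):
    return x * y + z**2

def y_of_x(x):
    return math.sin(x)

def z_of_x(x):
    return x**2

x0 = 0.7
approx = total_derivative(f, [y_of_x, z_of_x], x0)
# Differentiating x*sin(x) + x^4 directly gives sin(x) + x*cos(x) + 4x^3.
exact = math.sin(x0) + x0 * math.cos(x0) + 4 * x0**3
print(approx, exact)
```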
Let's say $y=t(x)$ and $g(x) = f(x, y)=f(x, t(x))$. We are interested in computing $g'(x_0)$, which intuitively is the change in the value of $g$ when we make an infinitesimal change in $x$. For $f(x, y)$ this means something slightly different: when we make an infinitesimal change $dx$ in $x$, $y$ changes as well, by $dy = t'(x)\,dx$. So the total differential of $f$ is $\frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}t'(x)dx$, and dividing by $dx$ gives the same answer.
Mathematically, writing $y_0 = t\left(x_0\right)$ and using $t\left(x_0 + h\right) \approx t\left(x_0\right) + ht'\left(x_0\right)$ to first order in $h$:
\begin{align}
g'\left(x_0\right) &= \lim_{h\to 0} \frac{g\left(x_0 + h\right) - g\left(x_0\right)}{h} \\
&= \lim_{h\to 0} \frac{f\left(x_0 + h, t\left(x_0 + h\right)\right) - f\left(x_0, t\left(x_0\right)\right)}{h} \\
&= \lim_{h\to 0} \frac{f\left(x_0 + h, t\left(x_0\right) + ht'\left(x_0\right)\right) - f\left(x_0, t\left(x_0\right)\right)}{h} \\
&= \lim_{h\to 0} \frac{f\left(x_0 + h, y_0 + ht'\left(x_0\right)\right) - f\left(x_0, y_0\right)}{h} \\
&= \lim_{h\to 0} \frac{\left(\frac{\partial f}{\partial x}\right)_{(x_0, y_0)}h + \left(\frac{\partial f}{\partial y}\right)_{(x_0, y_0)}ht'(x_0)}{h} \\
&= \left(\frac{\partial f}{\partial x}\right)_{(x_0, y_0)} + \left(\frac{\partial f}{\partial y}\right)_{(x_0, y_0)}t'(x_0)
\end{align}
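The step that replaces $t(x_0+h)$ by its linearization $t(x_0) + ht'(x_0)$ can be checked numerically: both difference quotients should approach the same limit as $h$ shrinks. The sketch below assumes, purely for illustration, $f(x,y)=x^2 y e^x$, $t(x)=x^2$ and $x_0 = 1$. The names `quotient_exact` and `quotient_linearized` are hypothetical.

```python
import math

def f(x, y):
    return x**2 * y * math.exp(x)

def t(x):
    return x**2

def t_prime(x):
    return 2 * x

def quotient_exact(x0, h):
    """Difference quotient of g(x) = f(x, t(x)) using the true t(x0 + h)."""
    return (f(x0 + h, t(x0 + h)) - f(x0, t(x0))) / h

def quotient_linearized(x0, h):
    """Same quotient with t(x0 + h) replaced by t(x0) + h * t'(x0)."""
    return (f(x0 + h, t(x0) + h * t_prime(x0)) - f(x0, t(x0))) / h

x0 = 1.0
for h in (1e-1, 1e-3, 1e-5):
    print(h, quotient_exact(x0, h), quotient_linearized(x0, h))

# Chain-rule value at x0 = 1: f_x + f_y * t' = 3e + e * 2 = 5e.
limit_value = 5 * math.e
```

The two columns differ by a term proportional to $h$, so the linearization does not change the limit.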