Understanding the syntax for derivatives - dy/dx

We write $\frac{dy}{dx}$, but this is just notation, a convention really. First, it is important to remember that this is not a ratio (see this, which is an excellent discussion of $\frac{dy}{dx}$); it is a limit, and there is a limit definition (see the brief section here for an idea). The idea is that we approximate the change of a function using an ever closer secant line (which 'becomes' the tangent line in the limit). Typically the first definition one sees is this: the derivative of $f(x)$ at $x=a$, written $f'(a)$, is

$$ f'(a)=\lim_{h \to 0} \frac{f(a+h)-f(a)}{(a+h)-a} $$

Rather than write this limit each time, we use $f'(a)$ as shorthand. If we knew the derivative in general (for any $x$), we would write $f'(x)$ instead of the general limit, because who wants to write that out every time! (Especially when we can find a formula for $f'(x)$!) Another way of writing $f'(x)$ is

$$ f'(x)=\frac{df}{dx} $$

read as the derivative of $f(x)$ with respect to $x$.

When it comes to taking multiple derivatives, we use the Leibniz notation. That is, $\frac{dy}{dx}$ means the derivative of the function $y(x)$ with respect to $x$. Meaning, we examine how much $y$ (or $y(x)$) changes when we change $x$ by a little bit. What if we want to look at the change of the change? Well, notationally we write

$$ \frac{d^2y}{dx^2}=\frac{d}{dx}\left(\frac{dy}{dx}\right) $$

In the previous notation, this would be $f''(x)$, and we just 'assume' we took the derivative with respect to $x$ since it is the only input variable for $f$. Notice that the $\frac{d}{dx}$ in front of the parentheses on the right should not be thought of as multiplication but rather as a function applied to what is inside (strictly speaking it is an operator; in any case, we don't multiply these!). We find $\frac{dy}{dx}$, then we take the derivative of $\frac{dy}{dx}$ again. What if we wanted to do this process $100$ times!? Well, we can write that compactly as

$$ \frac{d^{100}y}{dx^{100}} $$

It makes no sense to write $\frac{d^5y}{dx^4}$. Why? Because that's not how we defined the notation! Now why did we define it that way? Well, because we did!
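To see the limit definition at work, here is a minimal numerical sketch. The function $f(x)=x^2$, the point $a=3$, and the step sizes are illustrative choices of mine, not part of the definition; the point is only that the secant slopes settle toward $f'(3)=6$ as $h$ shrinks.

```python
def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / ((a + h) - a)


def f(x):
    # Illustrative choice: f(x) = x^2, whose derivative is f'(x) = 2x.
    return x ** 2


a = 3.0
steps = (1.0, 0.1, 0.01, 0.001)

# Secant slopes at a = 3 for shrinking h; they approach f'(3) = 6.
slopes = [difference_quotient(f, a, h) for h in steps]

for h, slope in zip(steps, slopes):
    print(f"h = {h:<6} secant slope = {slope:.6f}")
```

Nothing here computes the limit itself; it only samples the quotient for a few small $h$, which is usually enough to build intuition for what $f'(a)$ abbreviates.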

Now why do we write the superscripts the way we do? Well, the number in the numerator, the superscript of the $d$, tells us how many times we have taken the derivative in total. The superscript in the denominator, written on the $dx$, tells us how many times we have taken the derivative with respect to that variable. So in the above case, we have taken the derivative $100$ times, and each of those $100$ times we looked at how $y$ changed when we changed $x$; that is, we took the derivative with respect to $x$ $100$ times.
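One way to make "take the derivative $n$ times" concrete is to apply a single differentiation step in a loop. The sketch below assumes $y$ is a polynomial stored as a list of coefficients (a representation chosen purely for illustration; the helper names `d_dx` and `nth_derivative` are hypothetical).

```python
def d_dx(coeffs):
    """Differentiate a polynomial once.

    coeffs[k] is the coefficient of x**k, so [1, 0, 3] means 1 + 3x^2.
    """
    # The term c * x**k becomes k * c * x**(k - 1); the constant term drops out.
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]


def nth_derivative(coeffs, n):
    """Apply d/dx n times, which is what d^n y / dx^n abbreviates."""
    for _ in range(n):
        coeffs = d_dx(coeffs)
    return coeffs


y = [0, 0, 0, 0, 1]  # y = x^4

print(nth_derivative(y, 1))    # [0, 0, 0, 4], i.e. 4x^3
print(nth_derivative(y, 2))    # [0, 0, 12], i.e. 12x^2
print(nth_derivative(y, 4))    # [24]
print(nth_derivative(y, 100))  # [0]
```

The single counter `n` plays the role of both superscripts in $\frac{d^n y}{dx^n}$: with one input variable, the total number of derivatives and the number taken with respect to $x$ are necessarily the same.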

Now what if I had a function of $2$ variables, $y=f(x,z)$? I could look at how $y$ changes when I change $x$,

$$ \frac{\partial^1 y}{\partial x^1} $$

(when we have more than one input variable, we use $\partial$ instead of $d$. Why? TRADITION!) I could have looked at how it changed with respect to $z$,

$$ \frac{\partial^1 y}{\partial z^1} $$

Now what if I wanted to look at how this change with respect to $z$ itself changes when I change $x$? We take the derivative with respect to $x$, so we write

$$ \frac{\partial^2 y}{\partial x^1 \partial z^1} $$

Notice we have taken $2$ total derivatives, first with respect to $z$, then with respect to $x$. The top tells us how many times we took the derivative in total, and the bottom superscripts tell us how many times we took the derivative with respect to each variable. Of course, the numbers in the denominator need to add up to the one in the numerator! What we really mean is

$$ \frac{\partial}{\partial x}\left(\frac{\partial^1 y}{\partial z^1} \right) $$

as above. But the shorthand is easier, which is why we use it. Notice that the order in which we took the derivatives reads from right to left in the denominator: the variable on the far right ($z$) was differentiated first. Now one may ask whether

$$ \frac{\partial^2 y}{\partial z^1 \partial x^1}=\frac{\partial^2 y}{\partial x^1 \partial z^1} $$

always holds. The answer is no, not all the time! But in 'most' of the 'nice' cases, yes, for instance whenever both mixed second partial derivatives are continuous (which covers what one meets at the introductory undergraduate level!).
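As a quick sanity check of the 'nice' case, the two orders of differentiation can be compared numerically. This is only a sketch using central differences; the function $f(x,z)=x^2z^3$, the evaluation point, and the step size `h` are my own illustrative choices, and `partial_x` and `partial_z` are hypothetical helper names.

```python
def partial_x(f, h=1e-3):
    """Return a function approximating the partial derivative of f in x."""
    return lambda x, z: (f(x + h, z) - f(x - h, z)) / (2 * h)


def partial_z(f, h=1e-3):
    """Return a function approximating the partial derivative of f in z."""
    return lambda x, z: (f(x, z + h) - f(x, z - h)) / (2 * h)


def f(x, z):
    # Illustrative choice: y = x^2 z^3, whose exact mixed partial is 6 x z^2.
    return x ** 2 * z ** 3


# d/dx applied to (dy/dz): differentiate in z first, then in x.
d2_dx_dz = partial_x(partial_z(f))
# d/dz applied to (dy/dx): differentiate in x first, then in z.
d2_dz_dx = partial_z(partial_x(f))

x0, z0 = 1.5, 2.0
exact = 6 * x0 * z0 ** 2

print("z first, then x:", d2_dx_dz(x0, z0))
print("x first, then z:", d2_dz_dx(x0, z0))
print("exact 6 x z^2:  ", exact)
```

The nesting `partial_x(partial_z(f))` mirrors $\frac{\partial}{\partial x}\left(\frac{\partial y}{\partial z}\right)$ directly: the inner operator is applied first. For a smooth function like this one, both orders agree up to numerical error; a finite-difference check of this kind cannot, of course, prove equality in general.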

EDIT: With an answer this long, there are bound to be typos that I missed. If you are confused or think something is in error, please ask/let me know!

Also, as a shorthand for derivatives of functions of more than one variable, instead of

$$ \frac{\partial f}{\partial x} $$

one may often write $f_x$. Then

$$ \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial z}\right)=\frac{\partial^2 f}{\partial x^1 \partial z^1} $$

would be written $f_{zx}$. Notice that in the subscript notation the order in which we took the derivatives reads from left to right ($z$ first, then $x$), whereas in the denominator of the Leibniz form it reads from right to left.