Is the shorthand $ \partial_{\mu} $ strictly a partial derivative in field theory?

  1. No, one of the partial derivative symbols $\partial_{\mu}$ in OP's equation (2) is not correct if it is supposed to mean a partial derivative. The correct Euler-Lagrange (EL) equations read $$ \tag{2'} 0~\approx~\frac{\delta S}{\delta \phi^{\alpha}} ~=~\frac{\partial {\cal L}}{\partial \phi^{\alpha}} - \sum_{\mu} \color{Red}{\frac{ d}{dx^{\mu}}} \frac{\partial {\cal L}}{\partial (\partial_{\mu}\phi^{\alpha})} + \ldots,$$ where the $\approx$ symbol means equality modulo the equations of motion (eoms), and the ellipsis $\ldots$ denotes possible higher derivative terms. Here $$ \color{Red}{\frac{ d}{dx^{\mu}}}~=~ \frac{\partial }{\partial x^{\mu}} +\sum_{\alpha}(\partial_{\mu}\phi^{\alpha})\frac{\partial }{\partial \phi^{\alpha}} + \sum_{\alpha, \nu} (\partial_{\mu}\partial_{\nu}\phi^{\alpha})\frac{\partial }{\partial (\partial_{\nu}\phi^{\alpha})} + \ldots $$ is the $\color{Red}{\text{total spacetime derivative}}$ rather than a partial spacetime derivative. See also this and this related Phys.SE posts.

  2. Let us mention for completeness that the other appearance of the partial derivative symbol $\partial_{\mu}$ in OP's equation (2) is correct. It may be replaced with a total spacetime derivative $\color{Red}{d_{\mu}}$, since $\partial_{\mu}\phi\equiv\color{Red}{d_{\mu}}\phi$ by definition, cf. OP's eq. (3).
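
As a minimal illustration of point 1 (the Lagrangian density below is an assumed textbook example, not taken from OP's question), consider a single real scalar field with metric $\eta^{\mu\nu}$ and potential $V$: $$ {\cal L}~=~\frac{1}{2}\sum_{\mu,\nu}\eta^{\mu\nu}\,\partial_{\mu}\phi\,\partial_{\nu}\phi - V(\phi), \qquad \frac{\partial {\cal L}}{\partial (\partial_{\mu}\phi)}~=~\sum_{\nu}\eta^{\mu\nu}\partial_{\nu}\phi~\equiv~\partial^{\mu}\phi. $$ This momentum-like quantity has no explicit $x$-dependence, so a literal partial derivative $\frac{\partial}{\partial x^{\mu}}$ (with $\phi$ and $\partial\phi$ held fixed) would annihilate it, and eq. (2') would wrongly collapse to $V'(\phi)\approx 0$. With the total derivative, only the term with $\frac{\partial}{\partial(\partial_{\nu}\phi)}$ contributes: $$ \sum_{\mu}\color{Red}{\frac{ d}{dx^{\mu}}}\,\partial^{\mu}\phi ~=~\sum_{\mu,\nu}(\partial_{\mu}\partial_{\nu}\phi)\,\frac{\partial (\partial^{\mu}\phi)}{\partial (\partial_{\nu}\phi)} ~=~\sum_{\mu,\nu}\eta^{\mu\nu}\,\partial_{\mu}\partial_{\nu}\phi~\equiv~\Box\phi, $$ so that eq. (2') becomes $$ 0~\approx~-V'(\phi)-\Box\phi , $$ i.e. the familiar Klein-Gordon-type equation $\Box\phi+V'(\phi)\approx 0$.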


First, let's make sure we understand the notion of the total derivative in the particle case: The Lagrangian itself is a real-valued function $L(q,\dot{q},t)$, where $q$ and $\dot{q}$ are treated as independent variables, cf. this question or this answer of mine. When we speak of a "total" derivative in the context of the Euler-Lagrange equations, we actually mean that we take a path $q(t)$, compute its time-derivative $\dot{q}(t)$, then consider the function $L(q(t),\dot{q}(t),t)$, whose only free argument is now $t$, and then take the derivative w.r.t. $t$. To speak of a "total" or "partial" derivative is a handwavy way to distinguish between the Lagrangian as a function of the independent variables $q,\dot{q},t$ (this is the "partial" case) and the Lagrangian as a function of time after a time-dependent path has been plugged in (this is the "total" case). So, the expression $\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot{q}}$ means: Take the Lagrangian as a function of $q,\dot{q},t$, differentiate with respect to $\dot{q}$, then plug a path $q(t)$ into the resulting function, then differentiate with respect to $t$.
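
The recipe above can be sketched symbolically. This is only an illustration, assuming SymPy is available and using a harmonic-oscillator Lagrangian chosen for the example; the names `q_slot` and `v_slot` are hypothetical labels for the independent arguments $q$ and $\dot{q}$.

```python
import sympy as sp

t = sp.symbols("t")
m, k = sp.symbols("m k", positive=True)

# Independent "slots" of L(q, qdot, t): plain symbols, not functions of t.
# (Hypothetical names chosen for this sketch.)
q_slot, v_slot = sp.symbols("q_slot v_slot")

# Assumed example: harmonic oscillator, no explicit t-dependence.
L = m * v_slot**2 / 2 - k * q_slot**2 / 2

# Step 1: differentiate with respect to the qdot slot.
p = sp.diff(L, v_slot)

# "Partial" t-derivative: slots held fixed, so only explicit t counts.
partial_t = sp.diff(p, t)

# Step 2: plug in a path q(t); step 3: differentiate with respect to t.
path = sp.Function("q")(t)
on_path = {q_slot: path, v_slot: path.diff(t)}
total_t = sp.diff(p.subs(on_path, simultaneous=True), t)

# Euler-Lagrange expression dL/dq - d/dt (dL/dqdot), evaluated on the path.
euler_lagrange = sp.diff(L, q_slot).subs(on_path, simultaneous=True) - total_t

print(partial_t)
print(total_t)
print(euler_lagrange)
```

Here the "partial" $t$-derivative of $\partial L/\partial\dot{q}$ vanishes because nothing depends on $t$ explicitly, while the "total" one produces the $m\ddot{q}$ term that the equation of motion needs.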

So, in the field case, we have a function $\mathcal{L}(\phi,\partial_\mu\phi,x)$ that just treats $\phi$ and $\partial_\mu \phi$ as real numbers, and of which we take the "partial" derivatives $\frac{\partial\mathcal{L}}{\partial(\partial_\mu \phi)}$. This is just the derivative of this function with respect to its second argument, nothing special, just like in the particle case. Now, once again, you can plug a field $\phi(x)$ into this function, and you get a function $\mathcal{L}(\phi(x),\partial_\mu \phi(x),x)$ that depends only on $x$, and you can differentiate this object. Like the $\frac{\mathrm{d}}{\mathrm{d}t}$ in the particle case, the $\partial_\mu$ in the field version of the Euler-Lagrange equation is supposed to act in this way: You differentiate the function $\mathcal{L}$ with respect to its second argument, then plug in a field $\phi(x)$, then differentiate the resulting function w.r.t. $x^\mu$, so the derivative is indeed a "total" one.
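
The same plug-in-then-differentiate procedure can be checked against the explicit chain-rule expansion of the total derivative in a small symbolic sketch. It assumes SymPy, two coordinates $(t,x)$, and a made-up Lagrangian density with explicit $x$-dependence; the slot names `u`, `u_t`, `u_x` and the helper functions are hypothetical.

```python
import sympy as sp

t, x = sp.symbols("t x")
coords = (t, x)

# Independent slots standing for phi, d_t phi, d_x phi (hypothetical names).
u, u_t, u_x = sp.symbols("u u_t u_x")
grad_slots = (u_t, u_x)

# Made-up density with explicit x-dependence, for illustration only.
L = u_t**2 / 2 - (1 + x**2) * u**2 * u_x**2 / 2

phi = sp.Function("phi")(t, x)
on_field = {u: phi, u_t: phi.diff(t), u_x: phi.diff(x)}


def plug(expr):
    """Evaluate a function of the slots on the field configuration phi(t, x)."""
    return expr.subs(on_field, simultaneous=True)


def total_derivative(F, mu):
    """Chain-rule form: explicit part + phi part + gradient-slot part."""
    x_mu = coords[mu]
    result = plug(sp.diff(F, x_mu))
    result += phi.diff(x_mu) * plug(sp.diff(F, u))
    for nu, slot in enumerate(grad_slots):
        result += phi.diff(x_mu, coords[nu]) * plug(sp.diff(F, slot))
    return result


# dL/d(d_x phi), still a function of the independent slots.
F_x = sp.diff(L, u_x)

explicit_only = plug(sp.diff(F_x, x))      # "partial" d/dx, slots held fixed
plugged_then_diff = plug(F_x).diff(x)      # plug in phi first, then d/dx
chain_rule = total_derivative(F_x, 1)      # expansion of the total derivative

print(sp.simplify(chain_rule - plugged_then_diff))
print(sp.simplify(explicit_only - plugged_then_diff))
```

In this sketch the chain-rule expansion and the plug-in procedure agree, whereas the explicit-only "partial" derivative misses the terms that come from the $x$-dependence of $\phi(x)$ and its derivatives.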