Why can't any term which is added to the Lagrangian be written as a total derivative (or divergence)?
Let us for simplicity consider just classical point mechanics (i.e. a $0+1$ dimensional world volume) with only one variable $q(t)$. (The generalization to classical field theory on an $n+1$ dimensional world volume with several fields is straightforward.)
Let us reformulate the title question (v1) as follows:
Why can't the Lagrangian $L$ always be written as a total derivative $\frac{dF}{dt}$?
In short, it is because:
In physics, the action functional $S[q]$ should be local, i.e. of the form $S[q]=\int dt~L$, where $L$ is a function of the form $$L~=~L(q(t), \dot{q}(t), \ddot{q}(t), \ldots, \frac{d^Nq(t)}{dt^N};t),$$ and where $N\in\mathbb{N}_{0}$ is some finite order. (In most physics applications $N=1$, but this is not important in what follows. Note that the Euler-Lagrange equations get modified with higher-order terms if $N>1$.)
Similarly, we demand that $F$ is of local form $$F~=~F(q(t), \dot{q}(t), \ddot{q}(t), \ldots, \frac{d^{N-1}q(t)}{dt^{N-1}};t).$$ We stress that $L$ and $F$ only refer to the same time instant $t$. In other words, if $t$ is now, then $L$ and $F$ do not depend on the past or the future.
The $q$ variable plays a special intermediate role between $L$ and $t$. Note that $L$ and $F$ can have both implicit time dependence (through $q(t)$ and its derivatives) and explicit time dependence.
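For reference, here is a sketch of how the Euler-Lagrange equation is modified for general $N$. This is the standard higher-order form, and the shorthand $q^{(n)}\equiv\frac{d^nq}{dt^n}$ is notation introduced here for convenience:

$$\sum_{n=0}^{N}(-1)^n\frac{d^n}{dt^n}\frac{\partial L}{\partial q^{(n)}}~=~0.$$

For $N=1$ this reduces to the familiar $\frac{\partial L}{\partial q}-\frac{d}{dt}\frac{\partial L}{\partial \dot{q}}=0$.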
Counterexample: Consider
$$L~=~-\frac{k}{2}q(t)^2.$$ Then we can write $L=\frac{dF}{dt}$ as a total time derivative by defining
$$F=-\frac{k}{2}\int_0^t dt'~q(t')^2. $$
($F$ is unique up to a functional $K[q]$ that doesn't depend on $t$.) But $F$ is not of local form, as it also depends on the past $t'<t$.
Finally, let us mention that one can prove (still assuming locality in the above sense, and assuming that the configuration space is contractible), via an algebraic Poincaré lemma of the so-called bi-variational complex, see e.g. Ref. 1, that
$$ \text{The Lagrangian density is a total divergence} $$ $$\Updownarrow$$ $$\text{The Euler-Lagrange equations are identically satisfied}. $$
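As a simple illustration of this equivalence in point mechanics with $N=1$ (the Lagrangians $L_1$ and $L_2$ below are chosen here for illustration and are not taken from Ref. 1), consider first a Lagrangian that is a total derivative of a local function:

$$L_1~=~q\dot{q}~=~\frac{d}{dt}\left(\frac{q^2}{2}\right), \qquad \frac{\partial L_1}{\partial q}-\frac{d}{dt}\frac{\partial L_1}{\partial \dot{q}}~=~\dot{q}-\frac{d}{dt}q~=~0 \quad\text{for every } q(t).$$

By contrast, for the Lagrangian of the counterexample,

$$L_2~=~-\frac{k}{2}q^2, \qquad \frac{\partial L_2}{\partial q}-\frac{d}{dt}\frac{\partial L_2}{\partial \dot{q}}~=~-kq,$$

which for $k\neq 0$ vanishes only on the particular path $q(t)=0$ rather than identically. So $L_2$ cannot be the total derivative of a local $F$, consistent with the non-local $F$ found above.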
References:
1. G. Barnich, F. Brandt and M. Henneaux, Local BRST cohomology in gauge theories, Phys. Rep. 338 (2000) 439, arXiv:hep-th/0002245.
The Lagrangian is a function of time, the generalized coordinates, and the time derivatives of the generalized coordinates. Obviously many scalars are not total time derivatives; $q^2$ for example.
As for the Lagrangian density, keep in mind that it is a function of the field variables $\phi_i(x^\mu)$ and their derivatives $\partial_\mu \phi_i(x^\mu)$. It is not regarded as a composite function of the coordinates $\boldsymbol{x^\mu}$. So an arbitrary scalar function of the coordinates, in the form of a divergence, indeed does not matter, because the coordinates are independent of the field variables.
Proof that $q^2$ cannot be rewritten as total time derivative:
The total time derivative of any function $f(q,t)$, $$F(q,\dot q,t)=\frac{\mathrm{d}}{\mathrm{d}t}f(q,t)=\frac{\partial f}{\partial q}\dot{q}+\frac{\partial f}{\partial t},$$
automatically satisfies the Euler-Lagrange equation (easy to prove by substitution) $$\frac{\partial F}{\partial q}-\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial F}{\partial \dot{q}}=0.$$
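$q^2$ does not satisfy the above condition, since $$\frac{\partial (q^2)}{\partial q}-\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial (q^2)}{\partial \dot{q}}=2q,$$ which does not vanish identically. So it can't be written as a total time derivative.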
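The substitution step can be spelled out as follows, assuming that $f(q,t)$ is twice continuously differentiable so that its mixed partial derivatives commute (an assumption not stated above). With $F=\frac{\partial f}{\partial q}\dot{q}+\frac{\partial f}{\partial t}$, and treating $q$, $\dot{q}$ and $t$ as independent variables when taking partial derivatives,

$$\frac{\partial F}{\partial q}=\frac{\partial^2 f}{\partial q^2}\dot{q}+\frac{\partial^2 f}{\partial q\,\partial t}, \qquad \frac{\partial F}{\partial \dot{q}}=\frac{\partial f}{\partial q}, \qquad \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial f}{\partial q}=\frac{\partial^2 f}{\partial q^2}\dot{q}+\frac{\partial^2 f}{\partial t\,\partial q}.$$

The first and third expressions coincide, so $\frac{\partial F}{\partial q}-\frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial F}{\partial \dot{q}}=0$ holds for every path $q(t)$, not just for solutions of some equation of motion.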