Why do we write $(v\cdot \nabla) v$ instead of $v \cdot (\nabla v)$ for $v_j \frac{\partial}{\partial x_j} v_i$ in the material derivative?
The reason is that $\mathbf v\cdot (\nabla \mathbf v)$ is ambiguous:
- it could mean $\displaystyle v_i\frac{\partial v_j}{\partial x_i}$
- ...but it could also mean $\displaystyle v_i\frac{\partial v_i}{\partial x_j}$.
In other words, $\nabla\mathbf v$ is in general not a symmetric tensor, and the notation $\mathbf v\cdot (\nabla \mathbf v)$ does not specify which of its two indices gets contracted with $\mathbf v$. Placing the contracting vector on the left makes it a good bet that it contracts with the derivative index (and this works so long as you specify your convention up front), but it's not completely unambiguous the way that $(\mathbf v\cdot\nabla)\mathbf v$ is.
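To see that the two readings really do differ, here is a minimal sketch in plain Python. The rigid-rotation field $\mathbf v = (-y, x)$, the sample point, and the function names are my own choices for illustration, and the index convention `G[i][j]` $= \partial v_j/\partial x_i$ is an assumption stated in the comments.

```python
# Illustrative field (an assumption, not from the question): rigid rotation v = (-y, x).
def velocity(x, y):
    return (-y, x)

def grad_v(x, y):
    # Convention assumed here: G[i][j] = d v_j / d x_i (derivative index first).
    # For v = (-y, x) the gradient is constant in space.
    return ((0.0, 1.0), (-1.0, 0.0))

def contract_with_derivative(v, G):
    # Reading 1: v_i * d v_j / d x_i, i.e. (v . grad) v
    return tuple(sum(v[i] * G[i][j] for i in range(2)) for j in range(2))

def contract_with_component(v, G):
    # Reading 2: v_i * d v_i / d x_j, i.e. grad(|v|^2 / 2)
    return tuple(sum(v[i] * G[j][i] for i in range(2)) for j in range(2))

x, y = 1.0, 2.0
v = velocity(x, y)
G = grad_v(x, y)

advective = contract_with_derivative(v, G)      # (-1.0, -2.0): points toward the origin
gradient_like = contract_with_component(v, G)   # (1.0, 2.0): points away from the origin

print(advective, gradient_like)
```

For this field the first reading gives the centripetal acceleration $-(x, y)$, while the second gives $+(x, y)$, so the two contractions even point in opposite directions.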
Similarly, the way I see it, the notation $(\mathbf v\cdot\nabla)$ makes immediately obvious a message along the lines of "this isn't really a full gradient, it's just the directional derivative along $\mathbf v$ that's in play here", which is often precisely the message that needs to be transmitted.
The intuition for this statement is actually rather simple. While both notations are meant to denote the same quantity, I personally think that $\left(\textbf{v}\cdot\nabla\right)\textbf{v}$ is a lot more intuitive and holds more physical meaning.
Let's consider viewing the world from the perspective of a particle in your flow. Then if I define a function $f(\textbf{x})$ (which can be any field: scalar, vector, or tensor), I will measure that the field changes in time according to (assuming no explicit time dependence, $\partial_tf(\textbf{x})=0$)
$$\frac{\mathrm{d}f}{\mathrm{d}t}=\left(\textbf{v}\cdot\nabla\right)f(\textbf{x})$$
This is simply the chain rule. More intuitively, we see that the change of the function along the particle flow is just the directional derivative along $\textbf{v}$. Similarly, if we want to define the acceleration of a body along the flow lines, we take this same time derivative of $\textbf{v}$ itself, which gives $\left(\textbf{v}\cdot\nabla\right)\textbf{v}$, the very expression you said seemed to lack meaning. In fact, this operator is probably the most physically meaningful differential operator in the problem.
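To spell out the chain-rule step in components (a sketch, where I write $\textbf{x}(t)$ for the particle's trajectory, so that $\mathrm{d}x_j/\mathrm{d}t = v_j$, and allow an explicit time dependence for completeness):

$$\frac{\mathrm{d}}{\mathrm{d}t}f(\textbf{x}(t),t)=\frac{\partial f}{\partial t}+\frac{\mathrm{d}x_j}{\mathrm{d}t}\frac{\partial f}{\partial x_j}=\frac{\partial f}{\partial t}+v_j\frac{\partial f}{\partial x_j}$$

With $\partial_t f=0$ this reduces to $\left(\textbf{v}\cdot\nabla\right)f$. Choosing $f=v_i$ gives the acceleration of the particle,

$$a_i=\frac{\partial v_i}{\partial t}+v_j\frac{\partial v_i}{\partial x_j},$$

whose second term is exactly the $v_j \frac{\partial}{\partial x_j} v_i$ from the question: the sum runs over the derivative index, not over the component index of $\textbf{v}$.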
I hope this helped!