Why does the $L_2$ norm give the shortest path between 2 points?
If we want a sense of localness (or calculus to work), we'd like to be able to obtain the length by adding up the length from pieces of the path (for example using a ruler, or counting paces as we walk along the path between two points).
However, even considering just two dimensions we see something interesting for $L_p$.
$$\left(|x|^p + |y|^p\right)^{1/p} = \sum_{i=1}^N \left(\left|\frac{x}{N}\right|^p + \left|\frac{y}{N}\right|^p\right)^{1/p}$$
This trivially works with $p=1$, and due to a special symmetry at $p=2$ it works there as well. This will not work for other $p\neq 0$ (I am unsure of how to extend the definition to check $p=0$).
The special symmetry at $p=2$ is that the distance measurement becomes rotationally invariant. So the seemingly mundane reasons of
- space has more than one dimension
- locality
- uniformity
seem to already select $L_2$ as special. Any other choice would give a preferred coordinate system, and possibly break locality.
So what would a different universe in which $L_1$ or something else is the natural choice? If you imagined an N dimensional Cartesian lattice world, so one with discrete lengths, and a clearly preferred coordinate basis, this would make $L_1$ a more natural choice.
I'm not sure of a good picture for a universe in which $L_p, p>2$ would be a natural choice. There would be preferred directions, and you could only consider an object as a whole (not in parts), which seems to suggest in such a hypothetical universe you couldn't even experience your life as a sequence of moments (which I guess would make sense if we have highly non-local physics and therefore causality is out the window).
Interesting question.