Why do (can) we impose local gauge invariance?
I'm with you. I don't want to be unprofessional, but I find the whole "breaking causality" thing to be completely bogus. I see absolutely no way that the humble Klein-Gordon field "breaks causality." In my opinion, just ignore it.
"Why" we consider gauge invariant theories is a good question, and there are many answers. I will approach the question from only one possible direction. When you couple a gauge field to another field, the gauge field must necessarily be coupled to a conserved current. This is a manifestation of Noether's second theorem. That is, if you want your field to be sourced by a conserved current, using a field with gauge invariance is the easiest way to do it.
Let's restrict our discussion to the simplest possible gauge field: the free electromagnetic field, described by the vector potential $A^\mu$. The Lagrangian is $$ \mathcal{L} = -\frac{1}{4} F^{\mu \nu} F_{\mu \nu}. $$ Now we all know that this Lagrangian is invariant if we substitute $$ A_\mu \to A_\mu + \varepsilon \partial_\mu \Lambda $$ for some constant parameter $\varepsilon$. This is just our gauge invariance. In other words, for constant $\varepsilon$, $$ \delta \mathcal{L} = 0. $$
Let us now carry out the "Noether procedure," in the slick way I like to do it. Let's now make $\varepsilon$ spacetime dependent. That is, $\varepsilon = \varepsilon(x)$. We are still keeping it very small, so second order terms will not matter. Under this variation, you can pull out a piece of paper and find $$ \delta F_{\mu \nu} = \partial_\mu \varepsilon \partial_\nu \Lambda - \partial_\nu \varepsilon \partial_\mu \Lambda $$ and $$ \delta \mathcal{L} = -\frac{1}{2} F^{\mu \nu}(\partial_\mu \varepsilon \partial_\nu \Lambda - \partial_\nu \varepsilon \partial_\mu \Lambda). $$ Performing an integration by parts in our unwritten integral (and dropping the boundary term) and using the antisymmetry of $F^{\mu \nu}$, this becomes $$ \delta \mathcal{L} = \varepsilon \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda). $$ On solutions to the equations of motion, $\delta S$ must be $0$. This is just the principle of least action: any tiny variation must keep the action stationary. Imposing boundary conditions on $\varepsilon$ (it vanishes at the boundary but is otherwise arbitrary), we can see that on solutions to the equations of motion, $$ \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) = 0. $$ In other words, $F^{\mu \nu} \partial_\nu \Lambda$ is a conserved current.
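As a cross-check, here is a sketch of the integration by parts and of the conservation law itself, written out explicitly. The label $j^\mu_\Lambda$ is introduced here only for convenience, and the check assumes the free equations of motion $\partial_\mu F^{\mu \nu} = 0$ that follow from the Lagrangian above. The two terms of $\delta \mathcal{L}$ are equal by antisymmetry, so $$ \delta \mathcal{L} = -F^{\mu \nu} \partial_\mu \varepsilon \, \partial_\nu \Lambda = -\partial_\mu \big( \varepsilon F^{\mu \nu} \partial_\nu \Lambda \big) + \varepsilon \, \partial_\mu \big( F^{\mu \nu} \partial_\nu \Lambda \big), $$ and the first term is a total derivative that drops out of the action. Conservation can then be verified directly: $$ j^\mu_\Lambda \equiv F^{\mu \nu} \partial_\nu \Lambda, \qquad \partial_\mu j^\mu_\Lambda = (\partial_\mu F^{\mu \nu}) \partial_\nu \Lambda + F^{\mu \nu} \partial_\mu \partial_\nu \Lambda = 0. $$ The first term vanishes by the equations of motion, and the second vanishes because it contracts an antisymmetric tensor with a symmetric one.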
What I have just shown you is simply Noether's first theorem, albeit presented in a somewhat different way than usual. Interestingly, we have found that for any function $\Lambda$ on spacetime, we have a conserved current! We have found infinitely many conserved currents!
Why does no one ever talk about this? Well, it's not the best way to see what's going on, and you'll see why in a second.
Because $\partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) = 0$, we trivially have
$$ \int d^4 x \, \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) = 0 $$ on solutions to the equations of motion. Furthermore, we have $F^{\mu \nu} \partial_\mu \partial_\nu \Lambda = 0$ because $F^{\mu \nu}$ is antisymmetric and $\partial_\mu \partial_\nu \Lambda$ is symmetric. Therefore, $$ \int d^4 x \, (\partial_\nu \Lambda ) (\partial_\mu F^{\mu \nu} )= 0. $$ Integrating by parts, we have $$ -\int d^4 x \, \Lambda \, \partial_\mu \partial_\nu F^{\mu \nu} = 0. $$ Let's now duplicate the trick of Noether's first theorem. But instead of thinking about varying $\varepsilon$, let's think about varying $\Lambda$! Because the above integral must be $0$ for any $\Lambda$, we have $$ \partial_\mu \partial_\nu F^{\mu \nu} = 0. $$ This is the conservation equation for the electromagnetic field. Furthermore, it is an example of Noether's second theorem, which we have seen is like "Noether's theorem twice."
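You may object that deriving $\partial_\mu \partial_\nu F^{\mu \nu} = 0$ is not very impressive. (It follows directly from the antisymmetry of $F^{\mu \nu}$: relabeling $\mu \leftrightarrow \nu$ gives $\partial_\mu \partial_\nu F^{\mu \nu} = \partial_\nu \partial_\mu F^{\nu \mu} = -\partial_\mu \partial_\nu F^{\mu \nu}$. Identities of this kind are called Noether identities, or sometimes generalized Bianchi identities; this one is distinct from the usual Bianchi identity $\partial_{[\lambda} F_{\mu \nu]} = 0$.) The impressive part will be the next part.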
Let's say we want to couple our gauge field to some source current $J^\mu$: $$ \mathcal{L} = -\frac{1}{4} F^{\mu \nu} F_{\mu \nu} + J^\mu A_\mu. $$ Let us now suppose that our Lagrangian has the same gauge symmetry as before. I will now show that in that case, $J^\mu$ must be a conserved current.
Varying $$ A_\mu \to A_\mu + \varepsilon \partial_\mu \Lambda $$ we now have $$ \delta \mathcal{L} = \varepsilon \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) + \varepsilon J^\mu \partial_\mu \Lambda $$ which implies that on solutions to the equations of motion, $$ \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) + J^\mu \partial_\mu \Lambda = 0. $$ Integrating $0$ over all of spacetime, we have $$ \int d^4 x \, \big( \partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) + J^\mu \partial_\mu \Lambda \big) = 0. $$ Integrating by parts, $$ -\int d^4 x \, \Lambda \big( \partial_\mu \partial_\nu F^{\mu \nu} + \partial_\mu J^\mu \big) = 0. $$ As $\partial_\mu \partial_\nu F^{\mu \nu} = 0$ is trivially true, and as the above equation is true for all $\Lambda$, we see that $$ \partial_\mu J^\mu = 0. $$ So if we want our gauge field to be sourced by a current, and we make sure that our full Lagrangian is gauge invariant, our current will necessarily be conserved!
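The same conclusion can be cross-checked against the equations of motion. This is a sketch that assumes $J^\mu$ is an external source that does not itself depend on $A_\mu$; the overall sign of the source term depends on conventions. From the Lagrangian with the $J^\mu A_\mu$ coupling, $$ \frac{\partial \mathcal{L}}{\partial(\partial_\mu A_\nu)} = -F^{\mu \nu}, \qquad \frac{\partial \mathcal{L}}{\partial A_\nu} = J^\nu \quad \Longrightarrow \quad \partial_\mu F^{\mu \nu} = -J^\nu. $$ Taking one more divergence and using the antisymmetry of $F^{\mu \nu}$, $$ 0 = \partial_\nu \partial_\mu F^{\mu \nu} = -\partial_\nu J^\nu. $$ So the field equations have no solutions at all unless the source is conserved. Substituting $\partial_\mu F^{\mu \nu} = -J^\nu$ into $\partial_\mu(F^{\mu \nu} \partial_\nu \Lambda) + J^\mu \partial_\mu \Lambda$ also gives $(-J^\nu + J^\nu) \partial_\nu \Lambda = 0$, consistent with the on-shell statement above.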
So let's take a step back. In general, coupling fields to conserved currents is a desirable thing to do. In the example above, the electromagnetic field was sourced by electric current, although in other examples the current can be less familiar.
If you ask the question, "I want to couple a field to a conserved current, how exactly can I do that?" I have now shown you that if you ensure your field has a gauge symmetry, you cannot fail. This is perhaps the easiest way to see "why" gauge fields are desirable. Furthermore, this argument takes the element of "cleverness" out of constructing your Lagrangian. A global $U(1)$ charge symmetry gives you a conserved current via Noether's first theorem. If you then want to put that conserved current to work by coupling it to another field (making it interact) while still keeping it conserved, then promoting your global symmetry to a local symmetry will do just the trick!
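To make that last step concrete, here is a minimal sketch using a complex scalar field. Everything in it is an illustrative assumption rather than part of the argument above: the field $\phi$, the mass $m$, the coupling $e$, the mostly-minus metric signature, and the sign conventions for the phase and the covariant derivative. The free theory and its global symmetry are $$ \mathcal{L}_0 = \partial_\mu \phi^* \partial^\mu \phi - m^2 \phi^* \phi, \qquad \phi \to e^{-i e \alpha} \phi, $$ with the conserved Noether current, normalized here (up to an overall sign convention) as $$ j^\mu = -i e \big( \phi^* \partial^\mu \phi - \phi \, \partial^\mu \phi^* \big), \qquad \partial_\mu j^\mu = 0 \ \text{on solutions}. $$ Promoting $\alpha \to \Lambda(x)$ together with $A_\mu \to A_\mu + \partial_\mu \Lambda$ and replacing $\partial_\mu \to D_\mu = \partial_\mu + i e A_\mu$ gives $$ \mathcal{L} = -\frac{1}{4} F^{\mu \nu} F_{\mu \nu} + (D_\mu \phi)^* D^\mu \phi - m^2 \phi^* \phi = -\frac{1}{4} F^{\mu \nu} F_{\mu \nu} + \mathcal{L}_0 + A_\mu j^\mu + e^2 A_\mu A^\mu \phi^* \phi. $$ The gauge field is now sourced by $$ J^\mu = -i e \big( \phi^* D^\mu \phi - \phi \, (D^\mu \phi)^* \big), $$ which is conserved on solutions of the coupled equations. Note that in this example the source picks up an $A_\mu$-dependent piece from the quadratic term, so it is slightly more general than the purely linear $J^\mu A_\mu$ coupling used above, but the conclusion is the same.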