Probability amplitude in Layman's Terms
Part of your problem is
"Probability amplitude is the square root of the probability [...]"
The amplitude is a complex number whose squared magnitude is the probability. That is, $\psi^* \psi = P$, where the asterisk superscript means the complex conjugate.[1] It may seem a little pedantic to make this distinction, because so far the "complex phase" of the amplitudes has no effect on the observables at all: we could always rotate any given amplitude onto the positive real line, and then "the square root" would be fine.
But we can't guarantee to be able to rotate more than one amplitude that way at the same time.
Moreover, there are two ways to combine amplitudes to find the probability of observing combined events.
When the final states are distinguishable you add probabilities: $P_{dis} = P_1 + P_2 = \psi_1^* \psi_1 + \psi_2^* \psi_2$.
When the final states are indistinguishable,[2] you add amplitudes: $\Psi_{1,2} = \psi_1 + \psi_2$, and $P_{ind} = \Psi_{1,2}^*\Psi_{1,2} = \psi_1^*\psi_1 + \psi_1^*\psi_2 + \psi_2^*\psi_1 + \psi_2^*\psi_2$. The terms that mix the amplitudes labeled 1 and 2 are the "interference terms". The interference terms are why we can't ignore the complex nature of the amplitudes, and they cause many kinds of quantum weirdness.
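To make the two rules concrete, here is a minimal Python sketch. The helper names (`prob`, `p_distinguishable`, `p_indistinguishable`) and the amplitude values are made up for illustration and are not tied to any particular physical system.

```python
import cmath

def prob(psi: complex) -> float:
    """Probability from an amplitude: psi* psi = |psi|^2."""
    return (psi.conjugate() * psi).real

def p_distinguishable(psi1: complex, psi2: complex) -> float:
    """Distinguishable final states: add probabilities."""
    return prob(psi1) + prob(psi2)

def p_indistinguishable(psi1: complex, psi2: complex) -> float:
    """Indistinguishable final states: add amplitudes, then square."""
    return prob(psi1 + psi2)

# Illustrative amplitudes of equal magnitude (hypothetical values).
psi1 = 0.5 + 0j
for relative_phase in (0.0, cmath.pi / 2, cmath.pi):
    psi2 = 0.5 * cmath.exp(1j * relative_phase)
    print(f"relative phase {relative_phase:.2f}: "
          f"P_dis = {p_distinguishable(psi1, psi2):.3f}, "
          f"P_ind = {p_indistinguishable(psi1, psi2):.3f}")

# Rotating BOTH amplitudes by a common phase changes nothing observable.
common = cmath.exp(1j * 1.234)
psi2 = 0.5 * cmath.exp(1j * cmath.pi / 2)
unrotated = p_indistinguishable(psi1, psi2)
rotated = p_indistinguishable(common * psi1, common * psi2)
print(f"unrotated = {unrotated:.3f}, rotated = {rotated:.3f}")
```

With these numbers $P_{dis}$ stays at 0.5 whatever the relative phase, while $P_{ind}$ runs from 1 (phase 0) through 0.5 (phase $\pi/2$) down to 0 (phase $\pi$, up to floating-point rounding). A common phase applied to both amplitudes drops out; only the relative phase matters.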
[1] Here I'm using a notation reminiscent of a Schrödinger-like formulation, but that interpretation is not required. Just accept $\psi$ as a complex number representing the amplitude for some observation.
[2] This is not precise: the states need to be "coherent", but you don't want to hear about that today.
Before trying to understand quantum mechanics proper, I think it's helpful to try to understand the general idea of its statistics and probability.
There are basically two kinds of mathematical systems that can yield a nontrivial formalism for probability. One is the kind we're familiar with from everyday life: each outcome has a probability, and those probabilities directly add up to 100%. A coin has two sides, each with 50% probability. $50\% + 50\% = 100\%$, so there you go.
But there's another system of probability, very different from what you and I are used to. It's a system where each event has an associated vector (or complex number), and the sum of the squared magnitudes of those vectors (complex numbers) is 1.
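A small sketch of that bookkeeping, assuming complex-number amplitudes stored in a plain Python list. The `normalize` and `probabilities` helpers and the example amplitudes are hypothetical, chosen only to illustrate the rule that squared magnitudes sum to 1.

```python
import math

def normalize(amplitudes):
    """Rescale amplitudes so their squared magnitudes sum to 1."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
    if norm == 0:
        raise ValueError("at least one amplitude must be nonzero")
    return [a / norm for a in amplitudes]

def probabilities(amplitudes):
    """Squared magnitude of each amplitude."""
    return [abs(a) ** 2 for a in amplitudes]

# A "quantum coin": equal magnitudes, with a relative phase of i.
coin = [1 / math.sqrt(2), 1j / math.sqrt(2)]
coin_probs = probabilities(coin)

# Made-up unnormalized amplitudes, rescaled into a valid assignment.
state = normalize([3, 4j])
state_probs = probabilities(state)

print(coin_probs, sum(coin_probs))
print(state_probs, sum(state_probs))
```

The coin gives roughly 50% for each outcome, just like the classical one, and the second example gives roughly 36% and 64%. In both cases the phase of each amplitude is invisible as long as you only look at a single outcome.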
Quantum mechanics works according to this latter system, and for this reason, the complex numbers associated with events are what we often deal with. The wavefunction of a particle is just the distribution of these complex numbers over space. We have chosen to call these numbers the "probability amplitudes" merely as a matter of convenience.
The system of probability that QM follows is very different from what everyday experience leads us to expect, and this has many mathematical consequences. It makes interference effects possible, for example, and these can only be explained directly in terms of amplitudes. For this reason, amplitudes are physically significant: the mathematical model for probability on the quantum scale is not the one you and I are accustomed to.
Edit: regarding "just extra stuff under the hood." Here's a more concrete way of talking about the difference between classical and quantum probability.
Let $A$ and $B$ be mutually exclusive events. In classical probability, they would have associated probabilities $p_A$ and $p_B$, and the probability that one or the other occurs is obtained through addition, $p_{A \cup B} = p_A + p_B$.
In quantum probability, their amplitudes add instead. This is a key difference. There is a total amplitude $\psi_{A \cup B} = \psi_A + \psi_B$, and the squared magnitude of this amplitude (that is, the probability) is as follows:
$$p_{A \cup B} = |\psi_A + \psi_B|^2 = p_A + p_B + (\psi_A^* \psi_B + \psi_A \psi_B^*)$$
There is an extra term, which yields physically different behavior. It quantifies the effects of interference: for the right choices of $\psi_A$ and $\psi_B$, you could end up with two events that have nonzero individual probabilities while the probability of the union is zero! Or the probability of the union could be higher than the sum of the individual probabilities.
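To see how large the extra term can get, here is a short worked example. The polar-form symbols $\alpha$ and $\beta$ are my own labels for the phases of the two amplitudes, introduced only for this illustration:

$$\psi_A = \sqrt{p_A}\,e^{i\alpha}, \qquad \psi_B = \sqrt{p_B}\,e^{i\beta}$$

$$\psi_A^* \psi_B + \psi_A \psi_B^* = 2\sqrt{p_A p_B}\,\cos(\beta - \alpha)$$

$$\left(\sqrt{p_A} - \sqrt{p_B}\right)^2 \le p_{A \cup B} \le \left(\sqrt{p_A} + \sqrt{p_B}\right)^2$$

So the union probability can only vanish when $p_A = p_B$ and the phases differ by $\pi$. With equal phases and $p_A = p_B$ it reaches $4 p_A$, twice the classical sum. When the phases differ by $\pi/2$ the extra term is zero and the classical rule is recovered.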
In quantum mechanics, the amplitude $\psi$, and not the probability $|\psi|^2$, is the quantity that obeys the superposition principle. Notice that the dynamics of the physical system (the Schrödinger equation) is formulated in terms of this object and is linear in it. Working with superpositions of $\psi$ also lets complex phases $e^{i\theta}$ play a role. In the same spirit, the overlap of two systems is computed from the overlap of their amplitudes.
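As a rough illustration of superposition, phases, and overlap, here is a Python sketch using two-component amplitude vectors. The basis labels `up` and `down` and the helpers `overlap`, `superpose`, and `with_relative_phase` are hypothetical names chosen for this example, not part of any library.

```python
import cmath
import math

def overlap(phi, psi):
    """Inner product <phi|psi> = sum_i conj(phi_i) * psi_i."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def superpose(c1, state1, c2, state2):
    """Linear combination c1*state1 + c2*state2, component by component."""
    return [c1 * a + c2 * b for a, b in zip(state1, state2)]

s = 1 / math.sqrt(2)
up = [1 + 0j, 0j]
down = [0j, 1 + 0j]
plus = superpose(s, up, s, down)
minus = superpose(s, up, -s, down)

def with_relative_phase(theta):
    """The state (up + e^{i theta} down) / sqrt(2)."""
    return superpose(s, up, s * cmath.exp(1j * theta), down)

# Overlap probability |<plus|state>|^2 as the relative phase varies.
for theta in (0.0, math.pi / 2, math.pi):
    p = abs(overlap(plus, with_relative_phase(theta))) ** 2
    print(f"theta = {theta:.2f}: |<plus|state>|^2 = {p:.3f}")

# A global phase on the whole state leaves the overlap probability unchanged.
global_rotated = [cmath.exp(1j * 0.7) * c for c in plus]
p_global = abs(overlap(plus, global_rotated)) ** 2
print(f"global phase: |<plus|rotated>|^2 = {p_global:.3f}")
```

The overlap probability follows $\cos^2(\theta/2)$: it is 1 at $\theta = 0$, 0.5 at $\theta = \pi/2$, and 0 at $\theta = \pi$, where the state has become `minus`. Multiplying the whole state by $e^{i\theta}$ changes nothing, which is why only relative phases carry physical meaning.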