Why is it possible to have undistorted power amplifiers?
The magical term is "negative feedback". Even with nonlinear amplifiers, the overall feedback from the output to an error amplifier can correct these nonlinearities. This can result in highly linear systems, even when the individual components are not.
You can think of it like this:
The output amplitude is scaled down to the level of the input signal (a simple resistor divider has barely any problems with nonlinearity) and is fed back to the input stage. There this down-scaled output signal is compared with the input signal. If they don't match, the input stage corrects the output and thereby removes most of the distortion.
[Schematic created using CircuitLab]
In the schematic above the feedback is 1:1; it is not scaled down. This means the output voltage will be the same as the input voltage, but you can draw a lot more current.
If you put a 2:1 voltage divider in the feedback path, the output voltage would be double the input voltage.
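To make the feedback idea concrete, here is a minimal numerical sketch. Everything in it is an assumption for illustration: the `power_stage` curve (a tanh soft clip toward ±10 V), the error-amplifier `gain` of 10^4, and the bisection solver are not taken from the schematic. `beta` is the fraction of the output that is fed back, so `beta = 1.0` is the 1:1 case and `beta = 0.5` is the 2:1 divider.

```python
import math

def power_stage(u):
    """Hypothetical nonlinear power stage: soft-clips toward +/-10 V."""
    return 10.0 * math.tanh(u / 10.0)

def closed_loop(v_in, beta, gain=1e4):
    """Solve y = power_stage(gain * (v_in - beta * y)) for the output y.

    The residual falls monotonically with y, so plain bisection over the
    stage's output range [-10, 10] finds the operating point.
    """
    lo, hi = -10.0, 10.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if power_stage(gain * (v_in - beta * mid)) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Open loop: the stage on its own compresses large signals.
print("open loop, 8 V in:        ", power_stage(8.0))
# 1:1 feedback: the output follows the input closely.
print("feedback 1:1, 8 V in:     ", closed_loop(8.0, beta=1.0))
# 2:1 divider in the feedback path: the output is about twice the input.
print("feedback 2:1 div, 4 V in: ", closed_loop(4.0, beta=0.5))
```

In this toy model the loop lands within a few millivolts of the ideal value `v_in / beta`, while the bare stage is off by more than a volt at the same level. It is a static model only; it ignores delay and stability, which matter a great deal in a real amplifier.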
Negative feedback is discussed in other answers and it is the usual modern solution.
Then again, there is at least one more approach for making a linear power amplifier from non-linear elements:
Pre-distorting the input signal in a way that compensates for and/or cancels the non-linearity of the powerful output elements or the whole last stage.
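As a rough illustration of pre-distortion, the sketch below assumes a hypothetical output stage whose transfer curve is exactly `tanh` and pre-distorts with its exact inverse, `atanh`. Both function names and the curve are made up for the example; a real pre-distorter only approximates the inverse, so the cancellation is never this clean.

```python
import math

def power_stage(x):
    """Hypothetical compressive output stage (soft clipping toward +/-1)."""
    return math.tanh(x)

def predistort(x):
    """Inverse of power_stage, valid for |x| < 1: expands the signal
    by exactly the amount the stage will later compress it."""
    return math.atanh(x)

for x in (0.1, 0.5, 0.9):
    plain = power_stage(x)                  # distorted
    corrected = power_stage(predistort(x))  # distortion cancelled
    print(f"in={x:.1f}  plain={plain:.4f}  predistorted={corrected:.4f}")
```

Note that nothing is measured at the output here: the scheme relies entirely on knowing the stage's curve in advance, which is the main difference from feedback.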
A good (but not the only) example is how class-D amplifiers work. The signal is first used to pulse-width modulate (PWM) a carrier, which is then fed to a profoundly non-linear power stage that only switches fully on or fully off. The filtered output is more or less linear.
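The class-D principle can be sketched in a few lines. This is a toy model under stated assumptions: a triangle carrier, a comparator, an ideal ±1 switching stage, and a crude low-pass filter that simply averages over one carrier cycle. The function names, the 0.8-amplitude test sine, and the carrier-to-signal ratio of 100 are all chosen for illustration.

```python
import math

def triangle(phase):
    """Triangle carrier in [-1, 1] for phase in [0, 1)."""
    return 4.0 * abs(phase - 0.5) - 1.0

def signal(t):
    """Hypothetical test input: one sine period over t in [0, 1)."""
    return 0.8 * math.sin(2.0 * math.pi * t)

def class_d(carrier_cycles=100, samples_per_cycle=200):
    """Return one filtered output value per carrier cycle."""
    n = carrier_cycles * samples_per_cycle
    filtered = []
    for c in range(carrier_cycles):
        total = 0.0
        for k in range(samples_per_cycle):
            t = (c * samples_per_cycle + k) / n
            # Comparator plus switching stage: the output is only ever +1 or -1.
            total += 1.0 if signal(t) > triangle(k / samples_per_cycle) else -1.0
        # Crude low-pass filter: average over one carrier cycle.
        filtered.append(total / samples_per_cycle)
    return filtered

filtered = class_d()
# Compare each filtered value with the input at the middle of its carrier cycle.
worst = max(abs(y - signal((c + 0.5) / len(filtered)))
            for c, y in enumerate(filtered))
print(f"worst-case error after filtering: {worst:.3f}")
```

Even though the power stage only ever outputs two levels, the averaged output tracks the sine to within a few percent in this model. Raising the carrier frequency relative to the signal tightens the match.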
Other examples date from the era of the "10% rule" and thermionic valves:
The signal is inverted between two stages of similar non-linearity. The first stage distorts the signal in some way; the second stage distorts it in more or less the opposite manner.
The signal is inverted. Both the inverted and non-inverted paths are fed into a pair of tubes (or transistors) working in opposite directions in a class-A or class-AB output stage (push-pull). The non-linearities of the two elements cancel each other to a great extent.
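The push-pull cancellation in the second item can be checked numerically. The sketch assumes a made-up device curve with second- and third-order terms (coefficients 0.2 and 0.05 are arbitrary) and two perfectly matched devices; `harmonic` is a small single-bin DFT helper written for this example.

```python
import math

def device(v):
    """Hypothetical single device: linear term plus 2nd- and 3rd-order distortion."""
    return v + 0.2 * v**2 + 0.05 * v**3

def push_pull(v):
    """Two identical devices driven in antiphase, outputs subtracted."""
    return 0.5 * (device(v) - device(-v))

def harmonic(f, k, n=1024):
    """Amplitude of the k-th harmonic of f(sin(wt)), via a single-bin DFT."""
    re = im = 0.0
    for i in range(n):
        theta = 2.0 * math.pi * i / n
        y = f(math.sin(theta))
        re += y * math.cos(k * theta)
        im += y * math.sin(k * theta)
    return 2.0 * math.hypot(re, im) / n

for k in (2, 3):
    print(f"harmonic {k}: single={harmonic(device, k):.4f}  "
          f"push-pull={harmonic(push_pull, k):.4f}")
```

With matched devices the even-order (here second-harmonic) distortion cancels, while the odd-order term survives unchanged, which is why the cancellation is only "to a great extent". Real tubes or transistors are never perfectly matched, so some even-order residue remains as well.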
In the bad old days, before Harold Black, tube amplifiers were run open loop. They were already fairly linear, just about linear enough for audio, but not linear enough to amplify multiple telephone carriers frequency-multiplexed onto one line without the distortion spilling from one carrier into another.
His first thought was to detect the difference between the input and a fraction of the output, then apply the right amount of gain to it and add it to the output as a correction. Better, but because of the matching in amplitude and delay needed, it never really worked well enough in the real world to be worth it. It's only now making a comeback as DSP enables real time adjustable matching, and it's capable of very good power efficiency.
Then he came up with detecting the difference between the input and a fraction of the output, and using that with a very very large gain as the output. Sounds improbable if you say it like that, which was perhaps why it wasn't his first thought. Because the gain doesn't have to be right, just huge, it was the one that worked and, once the theory to handle stability had been worked out, took over the world.
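The point that the gain only has to be huge, not accurate, follows from the standard closed-loop gain expression A / (1 + A·beta). A small sketch, with the open-loop gains and the feedback fraction `beta = 0.1` picked arbitrarily for illustration:

```python
def closed_loop_gain(a, beta):
    """Closed-loop gain of an amplifier with open-loop gain a and
    feedback fraction beta (standard negative-feedback formula)."""
    return a / (1.0 + a * beta)

beta = 0.1  # hypothetical feedback fraction, ideal gain = 1 / beta = 10
for a in (100_000.0, 50_000.0):
    print(f"open-loop gain {a:>9.0f} -> closed-loop gain {closed_loop_gain(a, beta):.5f}")
```

Halving the open-loop gain moves the closed-loop gain by roughly one part in ten thousand here, so drift, tolerances, and nonlinearity in the forward amplifier barely show up at the output.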