How does an ARM MCU run faster than the external crystal?
This doesn't have anything to do with the core being an ARM processor; it's about how the clocking circuitry works:
In many systems like microcontrollers, RF chips, audio chips, … you need to generate a faster clock that is an exact multiple of some reference clock (for example, an external crystal).
You do that with a voltage-controlled oscillator (VCO), whose frequency you can adjust by raising or lowering a control voltage.
Now, by just setting some control voltage, you can get the VCO to oscillate at a frequency roughly in the right "ballpark", but not at an exact multiple of the input frequency. In particular, VCOs can be a bit drifty, so that frequency will also continuously "wander" all over the place. You need to control that oscillator by comparing it to the reference oscillator.
The way to do that is by employing a Phase-Locked Loop. The idea is simple:
- Divide the frequency that comes out of the VCO by a factor \$N\$; that's the factor by which we want the VCO to be faster than the reference. Doing that is easy: You can, for example, simply use a digital counter that counts to \$N\$ and only then changes the output.
- Compare that \$f_\text{VCO}/N\$ clock with the reference clock at \$f_\text{ref}\$. If one is faster than the other, adjust the VCO frequency accordingly. You can do that digitally by just XOR'ing both clocks – ideally, if they are identical, the result is a constant 0, but if one is faster than the other, the XOR of both clocks will be 1 for a growing fraction of the time; slow down or speed up the VCO accordingly.
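The two steps above can be sketched on sampled square waves. This is only an illustration under assumptions of my own: the helper names (`square_wave`, `divide_by_n`, `xor_mismatch`), the tick-based time grid and the periods are made up, and the divider here produces a roughly 50 % duty output by counting edges, which is one of several ways to build it.

```python
def square_wave(period, ticks):
    """Ideal clock sampled once per tick; `period` is given in ticks."""
    return [1 if (t % period) < period // 2 else 0 for t in range(ticks)]


def divide_by_n(clock, n):
    """Count rising edges of `clock`; the output completes one cycle per n input cycles."""
    out = []
    count = 0
    prev = clock[0]
    for level in clock:
        if level == 1 and prev == 0:  # rising edge of the fast clock
            count = (count + 1) % n
        out.append(1 if count < n // 2 else 0)
        prev = level
    return out


def xor_mismatch(a, b):
    """Fraction of samples for which the XOR of the two clocks is 1."""
    return sum(x ^ y for x, y in zip(a, b)) / len(a)


TICKS = 320
N = 8
reference = square_wave(32, TICKS)  # hypothetical reference: 32 ticks per cycle

# VCO at exactly N times the reference (4 ticks per cycle): the clocks coincide.
locked = xor_mismatch(reference, divide_by_n(square_wave(4, TICKS), N))
print(locked)  # 0.0

# VCO slightly too slow (5 ticks per cycle): the XOR output is 1 part of the time.
too_slow = xor_mismatch(reference, divide_by_n(square_wave(5, TICKS), N))
print(too_slow > 0)  # True
```

In a real loop, that XOR output would not be averaged in software; it would be low-pass filtered and fed back as the VCO control voltage.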
The above is a control loop, locked to the phase of the reference clock – hence the name.
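To see such a loop settle, here is a minimal phase-domain sketch. It assumes a linear phase comparator instead of the XOR gate and a purely proportional control law; the function name `simulate_pll`, the gain and all numeric values are illustrative assumptions, not taken from any particular chip. Once locked, \$f_\text{VCO}/N = f_\text{ref}\$, so the VCO should end up at \$N \cdot f_\text{ref}\$.

```python
def simulate_pll(f_ref=1.0, n=8, f_free=7.0, gain=16.0, dt=0.01, steps=2000):
    """Simulate a simplified PLL; frequencies are in cycles per unit time.

    f_free is the (wrong) frequency the VCO runs at with zero control input.
    Returns the final VCO frequency and the final phase error in cycles.
    """
    phase_ref = 0.0  # accumulated phase of the reference clock
    phase_vco = 0.0  # accumulated phase of the VCO
    f_vco = f_free
    error = 0.0
    for _ in range(steps):
        # Phase comparator: reference phase minus divided-down VCO phase.
        error = phase_ref - phase_vco / n
        # Control law: the phase error steers the VCO frequency.
        f_vco = f_free + gain * error
        # Both oscillators advance by one time step.
        phase_ref += f_ref * dt
        phase_vco += f_vco * dt
    return f_vco, error


f_vco, error = simulate_pll()
print(f_vco)  # settles very close to n * f_ref = 8.0
print(error)  # a small constant phase offset remains
```

With these made-up values, the free-running VCO starts at 7 times the reference and is pulled to 8 times; the remaining constant phase error is what holds the control input at the level needed to keep it there.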
For "rich" microcontrollers, which have a lot of peripherals and hence benefit from having multiple clocks internally, it's usual to have at least one PLL. The ATMega328 is a bit strange in that respect: It's a relatively power-hungry, relatively peripheral-rich microcontroller that still doesn't have a PLL.
Some devices have a PLL in them that can multiply the crystal frequency to higher frequencies. The ATMega328 does not have a PLL; it uses the crystal directly.