What is the difference between Simple Power Analysis and Differential Power Analysis?
Simple Power Analysis (SPA) involves measuring variations in power consumption of a device as it performs an operation, in order to discover information about secret key material or data. This is achieved by mapping certain operation types to consumption patterns. For example, a series of exclusive-OR operations exhibits a different trace on an oscilloscope to a series of multiplication operations.
A good example of this is RSA. A textbook square-and-multiply exponentiation performs a squaring for every bit of the private exponent, but an additional large multiplication only when that bit is 1. The pattern of operations visible in the trace therefore leaks the exponent, bit by bit.
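To make the RSA case concrete, here is a minimal sketch. It assumes a textbook left-to-right square-and-multiply exponentiation (the text above doesn't say which algorithm is used, and hardened implementations avoid this pattern). The function names and the toy numbers are made up for illustration; the string of S and M letters stands in for what you would read off the oscilloscope.

```python
def square_and_multiply(base, exponent, modulus):
    """Left-to-right binary exponentiation that records each operation.

    The returned string stands in for what an attacker reads off a trace:
    'S' for a squaring, 'M' for the extra multiplication.
    """
    result = 1
    operations = []
    for bit in bin(exponent)[2:]:          # most significant bit first
        result = (result * result) % modulus
        operations.append("S")
        if bit == "1":                     # key-dependent branch: the leak
            result = (result * base) % modulus
            operations.append("M")
    return result, "".join(operations)


def exponent_from_operations(operations):
    """Rebuild the exponent from the operation pattern alone."""
    bits = operations.replace("SM", "1").replace("S", "0")
    return int(bits, 2)


# Toy values only; a real modulus is thousands of bits long.
modulus, private_exponent, ciphertext = 3233, 2753, 2790
plaintext, pattern = square_and_multiply(ciphertext, private_exponent, modulus)
print(pattern)
print(exponent_from_operations(pattern) == private_exponent)
```

Every "S" followed by an "M" is a 1 bit and every lone "S" is a 0 bit, so the attacker never needs to see the numbers being multiplied, only which operation ran.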
Another trick involved in SPA capitalises on the fact that digital signals are represented physically by high and low voltages, and therefore a 1 being signalled uses more power than a 0, for the time period that that bit is being transmitted. If accurate enough analysis can be done, we can compute statistical models that predict the contents of a bit-stream within a reasonable margin of error.
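As a rough illustration of the bit-level idea, the sketch below simulates one power sample per transmitted bit and guesses each bit with a simple threshold. The leakage model (exactly 1.0 for a 1, 0.0 for a 0, plus Gaussian noise), the noise levels and the function names are all assumptions chosen for the example, not measurements from real hardware.

```python
import random


def measure(bits, noise_sigma, rng):
    """One simulated power sample per bit: the bit value plus Gaussian noise."""
    return [bit + rng.gauss(0.0, noise_sigma) for bit in bits]


def classify(samples, threshold=0.5):
    """Guess 1 when the sample sits above the midpoint, else 0."""
    return [1 if sample > threshold else 0 for sample in samples]


def accuracy(guessed, actual):
    """Fraction of bits guessed correctly."""
    return sum(g == a for g, a in zip(guessed, actual)) / len(actual)


rng = random.Random(7)
bits = [rng.getrandbits(1) for _ in range(2000)]

quiet = accuracy(classify(measure(bits, 0.2, rng)), bits)  # clean signal
noisy = accuracy(classify(measure(bits, 1.0, rng)), bits)  # busy system
print(f"low noise: {quiet:.3f}, high noise: {noisy:.3f}")
```

With little noise the threshold recovers nearly every bit; as the noise grows, the guesses drift towards a coin flip, which is the problem that more complex systems pose.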
Whilst relatively simple single-purpose systems (e.g. smart cards) are usually only performing a single operation, more complex systems are usually performing several operations at once. A modern desktop computer uses technologies like DMA, which allow hardware to access system memory directly, without interrupting the processor. All of these signals run in parallel, generating a huge amount of noise. On top of that, mechanical hard drives contain components that generate various current spikes, adding to the noise. Trying to identify the statistical patterns of a single 133MHz+ data signal in such a system is exceedingly difficult!
This is where Differential Power Analysis (DPA) comes in. Rather than reading a single trace, DPA records power consumption over many runs of the cryptographic operation and compares the measurements statistically. In the classic form, the traces are split into two groups according to one bit of an intermediate value, predicted from the known input and a guess at part of the key, and the averages of the two groups are compared. Noise that is unrelated to the prediction averages out, so a consistent difference only appears when the guess is right; in effect this "subtracts" the noise from the signal. The idea is similar to a technique that is common among audio engineers, who sample background noise (e.g. hiss) on its own, do an FFT analysis of it, then produce an EQ-like filter model that reduces the levels for those frequencies. It's more complex in DPA (often FFT alone is too primitive) but the principle remains true.
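The sketch below shows one common textbook form of DPA, a difference-of-means test, on simulated data. Everything about the "device" is an assumption made for the example: it leaks the number of 1 bits in a 4-bit S-box output (the table is the PRESENT cipher's S-box, used here only as a small non-linear step), each trace is a single power sample, and the noise is Gaussian. The function names are hypothetical.

```python
import random

# 4-bit substitution table (PRESENT S-box), used only as a small non-linear step.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def simulate_traces(secret_nibble, count, noise_sigma, rng):
    """Return (input, power) pairs for a hypothetical device.

    Assumed leakage model: power = number of 1 bits in SBOX[input ^ key]
    plus Gaussian noise standing in for everything else in the system.
    """
    traces = []
    for _ in range(count):
        value = rng.randrange(16)
        intermediate = SBOX[value ^ secret_nibble]
        power = bin(intermediate).count("1") + rng.gauss(0.0, noise_sigma)
        traces.append((value, power))
    return traces


def difference_of_means(traces, guess):
    """Split traces by the predicted top bit of the S-box output for this guess."""
    ones, zeros = [], []
    for value, power in traces:
        predicted_bit = (SBOX[value ^ guess] >> 3) & 1
        (ones if predicted_bit else zeros).append(power)
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)


def recover_key_nibble(traces):
    """Pick the guess whose 'bit = 1' group draws the most extra power."""
    return max(range(16), key=lambda guess: difference_of_means(traces, guess))


rng = random.Random(1)
traces = simulate_traces(secret_nibble=0xB, count=20000, noise_sigma=1.0, rng=rng)
print(hex(recover_key_nibble(traces)))
```

The noise on any single sample is as large as the contribution of the bit being predicted, yet the averages over thousands of traces separate cleanly for the right guess. The signed difference is used rather than its absolute value because the assumed model says a 1 draws more power; with this particular table some wrong guesses produce an equally large difference of the opposite sign.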
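Of course, this is still a very difficult model to compute, but even an imperfect prediction rate is useful when dealing with cryptographic keys; bruteforcing a 128-bit key becomes easier when you can prioritise higher-confidence predicted bits. The gain depends heavily on the accuracy, though. Using Shannon entropy as a rough estimate, and assuming every bit is predicted independently with the same accuracy, a 51% success rate leaves an effective expected key space of about 127.96 bits instead of 128 bits, which is a very small saving. At 75% it drops to roughly 104 bits, and at 90% to roughly 60 bits.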
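One rough way to put numbers on the effect of prediction accuracy is the Shannon entropy of each predicted bit. The sketch below assumes every key bit is predicted independently with the same accuracy, which is a simplification, and the result is an estimate of remaining uncertainty rather than an exact brute-force cost. The function names are made up for the example.

```python
import math


def binary_entropy(p):
    """Shannon entropy, in bits, of a guess that is right with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def remaining_key_bits(key_bits, per_bit_accuracy):
    """Estimated uncertainty left in the key, assuming independent bits."""
    return key_bits * binary_entropy(per_bit_accuracy)


for accuracy in (0.51, 0.75, 0.90, 0.99):
    print(f"{accuracy:.2f} -> {remaining_key_bits(128, accuracy):.2f} bits")
```

The curve is flat near 50%, so a tiny edge over guessing buys very little on its own, but it falls away quickly once the predictions become more reliable.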
FPGAs are generally much harder to attack with SPA because they're "parallel" in nature. They don't really work like a normal microprocessor; instead, they're a series of parallel logic gates that all operate in unison on a single common clock signal. As such, SPA struggles to identify separate "signals", because so much of the internal logic switches together on every clock cycle. Microprocessors, on the other hand, often have separate internal die sections that can be independently clocked, and which route and multiplex signals in real-time. This is all somewhat of a generalisation, since some more modern complex FPGAs do have microprocessor-like internals alongside the static logic, which may be vulnerable to SPA.