[Crypto] Is a LFSR sufficient for post-processing a biased TRNG?
Solution 1:
I'll assume the "16" on the right of the question's figure indicates that the LFSR is unloaded 16 bits at a time; and unless otherwise stated, I'll assume that's done every 16 clock cycles of the LFSR.
Is the LFSR sufficient to unbias the input?
Mostly yes, but that's not enough in crypto. It's desired to make the output indistinguishable from random (not just unbiased) for an adversary who knows the design of the system (here, that there is a 16-bit LFSR, and even which one). That's per Kerckhoffs's (second) principle.
Is a LFSR sufficient for post-processing a biased TRNG?
Positively no, in an unspecified cryptographic context, and assuming full use of the output (see first paragraph).
The problem is that from the output of the LFSR, its design (mostly, the feedback polynomial), and perhaps its initial state (see below), it's trivial to get back to the exact input of the LFSR, and from that take advantage of the bias to build a distinguisher. If the LFSR output is used for something critical (e.g. generating a One Time Pad), that can be a disaster.
If the LFSR is in "scrambler" configuration, an adversary can feed the LFSR's output into the corresponding "descrambler", and that will output what was the scrambler's input. Exactly so if the descrambler starts in sync with the scrambler, or after the first 16 bits otherwise (a descrambler for an $n$-bit LFSR self-synchronizes after $n$ bits). In other words, that descrambler trick, which the adversary can apply, undoes what the LFSR in the question's generator did, thus the LFSR is not sufficient.
This is illustrated below. On top of the picture is the question's LFSR, simplified (serial output scrambled, $n=4$ stages). At the bottom is the descrambler, added/simulated by the attacker. It should be apparent that after $n=4$ clocks, the bottom circuit is in sync with the top one, and the descrambled output is the scrambler's input, with whatever detectable bias.
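To make the descrambler trick concrete, here is a minimal Python sketch under the figure's toy parameters ($n=4$ stages; the tap positions are illustrative placeholders, not the question's polynomial). The attacker's self-synchronizing descrambler recovers the biased input verbatim:

```python
import random

N = 4          # stages, matching the simplified figure
TAPS = (0, 3)  # hypothetical feedback taps; the exact polynomial doesn't matter here

def scramble(bits):
    """Multiplicative ('scrambler') LFSR: state holds the last N output bits."""
    state = [0] * N
    out = []
    for b in bits:
        fb = b ^ state[TAPS[0]] ^ state[TAPS[1]]
        out.append(fb)
        state = [fb] + state[:-1]
    return out

def descramble(bits):
    """Attacker's descrambler: state holds the last N *received* bits,
    so it self-synchronizes after N bits whatever its starting state."""
    state = [0] * N
    out = []
    for b in bits:
        out.append(b ^ state[TAPS[0]] ^ state[TAPS[1]])
        state = [b] + state[:-1]
    return out

biased = [1 if random.random() < 0.8 else 0 for _ in range(10000)]  # 80% ones
recovered = descramble(scramble(biased))
assert recovered[N:] == biased[N:]       # in sync after N bits at the latest
print(sum(recovered) / len(recovered))   # ~0.8: the bias is fully exposed
```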
If the LFSR was in some other configuration (descrambler or "additive scrambler"), an adversary needs to guess the LFSR's initial state, if it is not known in the context (e.g. zero at reset). Even with no clue, the attack remains easy: there are only $2^{16}$ possible initial states, and we can select whichever one gives the most marked bias for the reconstructed input, again making the LFSR not sufficient.
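A sketch of that guessing attack, with the same toy parameters as above (a real attack enumerates the $2^{16}$ states, which is still trivial):

```python
import random

N, TAPS = 4, (0, 3)  # toy size and illustrative taps, as before

def keystream(state, n_bits):
    """Output stream of a free-running (additive-scrambler) LFSR."""
    out = []
    for _ in range(n_bits):
        out.append(state[-1])
        state = [state[TAPS[0]] ^ state[TAPS[1]]] + state[:-1]
    return out

def attack(output):
    """Try every initial state; keep the reconstructed input showing the
    strongest bias, which is almost surely the true one."""
    def candidate(s):
        ks = keystream([(s >> i) & 1 for i in range(N)], len(output))
        return [o ^ k for o, k in zip(output, ks)]
    return max((candidate(s) for s in range(2 ** N)),
               key=lambda bits: abs(sum(bits) / len(bits) - 0.5))

biased = [1 if random.random() < 0.8 else 0 for _ in range(4096)]
state = [random.randint(0, 1) for _ in range(N)]     # unknown to the attacker
output = [b ^ k for b, k in zip(biased, keystream(state, len(biased)))]
print(sum(attack(output)) / len(output))             # ~0.8: bias recovered
```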
For something that works and still only uses a relatively small LFSR, we want to heavily undersample ("decimate") the output of a LFSR in scrambler configuration; e.g. keep one output bit out of 64 (almost equivalently, unload the 16 bits of the LFSR every 1024 clocks instead of 16). The general idea is that entropy is gathered/mixed in the LFSR between outputs. There's a trade-off between quality of the output and output rate, controlled by the undersampling/decimation rate.
There's a simple theory for how much decimation is too little: if there are $h<1$ bits of entropy per bit out of the sampler, then anything less than $1/h$ bits into the LFSR stage for each bit out is too few. So if, in an experiment, we capture the output of the sampler and feed it to a lossless compressor like bzip2, and it compresses by a factor of $c$ (e.g. $c=4$ for 100MiB compressed to 25MiB), then any decimation of less than $c=4$ sampled bits in for one bit out of the LFSR stage is too little. Also, if at the output of the sampler we have bits with average $a\ne1/2$, anything less than $c=-1/(a\log_2(a)+(1-a)\log_2(1-a))$ is too few (beware this formula does not account for correlation between bits, and thus is next to useless except for extremely biased input).
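Both lower bounds are easy to compute; a sketch in Python (the bias formula, as cautioned, ignores correlation between bits):

```python
from math import log2
import bz2

def min_decimation_from_bias(a):
    """1/h for a Bernoulli(a) bit: a lower bound on sampled bits in per bit out."""
    h = -(a * log2(a) + (1 - a) * log2(1 - a))  # Shannon entropy per bit
    return 1 / h

def min_decimation_from_compression(raw: bytes):
    """Compression factor c of captured sampler output; decimating by less is too little."""
    return len(raw) / len(bz2.compress(raw))

print(min_decimation_from_bias(0.75))  # ≈ 1.23
print(min_decimation_from_bias(0.99))  # ≈ 12.4
```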
Unfortunately, I have no simple theory for how much decimation is enough, or how LFSR size matters! The standard recommendation is to be extremely conservative if the LFSR is the only means to condition the sampler's output, or (better) be conservative and use the LFSR to seed a Cryptographically Secure Pseudo Random Number Generator, using the output of that in the end. Also see the last paragraph about the need to detect field failures.
I wish I knew exactly what condition (if any) on the LFSR in scrambler mode is necessary to ensure no long-term bias in the output (assuming no feedback from output to input, and independent bits in input [or even much weaker and plausible assumptions]). But even a 1-stage LFSR in scrambler configuration (feedback polynomial $x+1$, a single D flip-flop, 2-input XOR including input) does that. Followed by ample decimation, that's one of the simplest correct RNG post-processors.
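A minimal model of that post-processor: the 1-stage scrambler with polynomial $x+1$ reduces to a running XOR, and the decimation rate of 64 below is an arbitrary placeholder.

```python
def postprocess(bits, decimation=64):
    """1-stage LFSR in scrambler configuration: out[t] = in[t] XOR out[t-1],
    i.e. a running XOR; keep only one output bit in every `decimation`."""
    acc, out = 0, []
    for i, b in enumerate(bits, start=1):
        acc ^= b                    # the single D flip-flop's content
        if i % decimation == 0:
            out.append(acc)
    return out
```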
Would unbiasing the input with e.g. a Von Neumann extractor be beneficial?
Most likely it would be beneficial. It would even work perfectly if the bits at the output of the sampler are independent. But:
- Absent that assurance, there can remain a detectable bias.
- Output rate becomes irregular, with no minimum, and that can be a serious problem¹.
The first issue could be dealt with by cascading several Von Neumann extractors, but each stage halves the average throughput and worsens the second issue. An LFSR in scrambler configuration followed by heavy undersampling is more robust, and at least the throughput is predictable.
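For reference, a sketch of the Von Neumann extractor discussed above; it is perfect only under the independence assumption, and its output length varies from run to run (cascading is simply `von_neumann(von_neumann(bits))`):

```python
import random

def von_neumann(bits):
    """01 -> 0, 10 -> 1, discard 00 and 11 (non-overlapping pairs)."""
    return [a for a, b in zip(bits[0::2], bits[1::2]) if a != b]

raw = [1 if random.random() < 0.8 else 0 for _ in range(100000)]
once = von_neumann(raw)
print(sum(once) / len(once))   # ≈ 0.5, since these input bits are independent
print(len(once) / len(raw))    # ≈ 0.16 here: rate is irregular and data-dependent
```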
Beware that the hardest part in designing a Cryptographically Secure True Random Number Generator is not making it secure² when it works, but detecting when it does not work as intended (say, because an attacker pours liquid nitrogen on it, or just some mundane corollary of Murphy's law), so that the overall gizmo remains secure.
¹ I once was involved with finding why some Smart Card accepting devices sometimes got bricked in the field with some types of cards, degenerating into expensive consequences. It ended up being an overreaction by the accepting device to a failure in the ISO/IEC 7816-3 protocol, following a timeout: the card was sometimes too slow to answer. According to the Smart Card manufacturer, that was due to random operating delays induced by a Von Neumann extractor (I doubt this is the full story of the card's misbehavior, but trust it involved irregular RNG throughput).
² It's still hard! Potential problems among many include unwanted feedback loops from conditioned output to the source/sampler.
Solution 2:
No, not really.
- Your schematic is missing a crucial component called a decimator. The output of coupled ring oscillators (ROs) in the real world tends not to have as much jitter as in idealised literature. Essentially, there's little jitter as they tend to sync together under one system clock. It can look like:-
The output from your NOR gates will be highly correlated, with long runs of ones or zeros. You decimate the raw bit feed from the ROs to concentrate entropy. You'd need either an XOR or counter decimator (behavioural sketches follow this list). XORs XOR $n$ pulses together, whilst a counter decimator will emit a single pulse per $n$ pulses. In the literature you see them as $\div K$ or $KD$. Decimation ratios of 1024 are not uncommon. An example is:-
That's a 20 bit decimator (divide by up to one million). From *True random number generators for cryptography: Design, securing and evaluation*.
- And your scheme is not a proper TRNG. A true TRNG has to output fewer bits than the entropy it gathers upstream. The basic idea of an extractor is to compute $k$ output bits with high randomness from $n>k$ input bits with less randomness. Feeding a LFSR directly with $n$ pulses (with high correlation, low entropy) produces exactly the same number of pulses, i.e. $k=n$. The output will look random (with a non-deterministic output rate) but will only be pseudo-random.
Also, outputting a 16 bit wide bus is nonsensical. We would aim for a single bitstream.
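As referenced above, hedged behavioural sketches of the two decimator types, in Python, with the cited common ratio $K=1024$ as a placeholder:

```python
from functools import reduce
from operator import xor

def xor_decimate(bits, k=1024):
    """XOR decimator: fold each group of k raw bits into one output bit.
    For independent bits the piling-up lemma says the residual bias shrinks
    rapidly with k; correlated runs weaken that guarantee."""
    return [reduce(xor, bits[i:i + k]) for i in range(0, len(bits) - k + 1, k)]

def counter_decimate(pulse_count, k=1024):
    """Counter decimator (÷K): one output pulse per k input pulses."""
    return pulse_count // k
```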