Is there any legitimate use for Intel's RDRAND?
As pointed out in the other answer, RDRAND is seeded with true randomness. In particular, it frequently reseeds its internal CSPRNG with 128 bits of hardware-generated randomness, guaranteeing a reseed at least once every 511 * 128 bits. See section 4.2.5 of this doc:
https://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide
So in your examples, you used a single 128-bit seed to generate 10 million random draws from rabbit_extract. In the RDRAND version, you had the equivalent of 2.5 million 128-bit draws, meaning that the CSPRNG was reseeded at least 2,500,000/511 ≈ 4,892 times.
So instead of 128 bits of entropy going into your rabbit example, there were at least 4,892*128 = 626,176 bits of entropy going into the RDRAND example.
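To make the arithmetic above easy to check, here is a small sketch that recomputes it. It does not call RDRAND; it only uses the figures quoted in this answer. The 32-bit draw size is my assumption, inferred from 10 million draws being equated with 2.5 million 128-bit blocks, and the function and constant names are made up for illustration.

```python
# Recompute the lower bound on reseeds and seed entropy from the figures above.
BLOCK_BITS = 128             # size of one DRBG output block / one reseed
MAX_BLOCKS_PER_RESEED = 511  # at most 511 blocks are produced between reseeds


def min_reseed_stats(draws: int, bits_per_draw: int) -> tuple[int, int, int]:
    """Return (128-bit blocks consumed, minimum reseeds, minimum seed bits)."""
    blocks = draws * bits_per_draw // BLOCK_BITS
    min_reseeds = blocks // MAX_BLOCKS_PER_RESEED  # rounded down, so a lower bound
    min_seed_bits = min_reseeds * BLOCK_BITS
    return blocks, min_reseeds, min_seed_bits


# Assumption: each of the 10 million draws is 32 bits wide.
blocks, reseeds, seed_bits = min_reseed_stats(draws=10_000_000, bits_per_draw=32)
print(blocks, reseeds, seed_bits)  # 2500000 4892 626176
```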
That's much, much more entropy than you're going to get in 0.361 seconds without hardware support. That could matter if you're doing stuff where lots of real randomness is important. One example is Shamir secret sharing of large quantities of data -- not sure if there are others.
So in conclusion -- it's not for speed, it's for high security. The question of whether it's backdoored is troubling, of course, but you can always XOR it with other sources, and at the very least it's not hurting you.
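As a rough sketch of the "XOR it with other sources" idea: Python's standard library has no RDRAND access, so `hw_bytes` below is a hypothetical stand-in for bytes you obtained from RDRAND some other way, and `os.urandom` plays the role of the second source. The function names are mine. The argument that the result is at least as good as the better input assumes the two sources are independent.

```python
import os


def xor_combine(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings together."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def mixed_random_bytes(hw_bytes: bytes) -> bytes:
    """Combine hardware-supplied bytes with the same number of OS-supplied bytes.

    hw_bytes is a hypothetical placeholder for RDRAND output.
    """
    return xor_combine(hw_bytes, os.urandom(len(hw_bytes)))
```

Even if `hw_bytes` were entirely predictable (all zeros, say), the output would still be exactly the `os.urandom` bytes, which is the sense in which mixing "is not hurting you".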
RDRAND is not just a PRNG. It is a whitened TRNG that is FIPS compliant. The difference is that you can rely on RDRAND to contain quite a lot of actual entropy directly retrieved from the CPU. So the main use of RDRAND is to supply entropy to OS/libraries/applications.
The only other good way for applications to retrieve entropy is usually an OS-supplied entropy source such as /dev/random or /dev/urandom (which usually draws its entropy from /dev/random). However, the OS also has to find the entropy somewhere. Usually tiny differences in disk and network access times are used for this (plus other semi-random input). These devices are not always present, and are not designed as sources of entropy; they are often not very good sources, nor are they very fast. So on systems that support it, RDRAND is often used as an entropy source for the cryptographically secure random number generator of the operating system.
With regard to speed, especially for games, it is completely valid to use a (non-secure) PRNG. If you want a reasonably random seed, then seeding it with the result of RDRAND may be a good idea, although seeding it from the OS-supplied RNG may be a more portable and even a more secure option (in case you don't fully trust Intel or the US).
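A minimal sketch of the portable option, seeding a fast non-secure PRNG from the OS-supplied RNG, might look like this. It uses Python's `random.Random` (a Mersenne Twister) as the game PRNG and `os.urandom` as the OS source; `make_game_rng` is a name I made up, and the 128-bit seed size is just a reasonable choice, not a requirement.

```python
import os
import random


def make_game_rng(seed_bytes: int = 16) -> random.Random:
    """Return a fast, non-cryptographic PRNG seeded once from the OS RNG."""
    seed = int.from_bytes(os.urandom(seed_bytes), "big")  # 128-bit seed by default
    return random.Random(seed)


rng = make_game_rng()
roll = rng.randint(1, 6)  # cheap draws from here on; no further OS calls
```

The OS is only consulted once, at start-up, so the per-draw cost is that of the PRNG alone. Nothing drawn from `rng` should be used for keys or other secrets.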
Note that currently RDRAND is implemented using (AES) CTR_DRBG instead of a (less well analysed) stream cipher that was created for speed such as Rabbit, so it should come as no surprise that Rabbit is faster. Even more importantly, it also has to retrieve the entropy from the entropy source within the CPU before it can run.
Is there any legitimate use for Intel's RDRAND?
Yes.
Consider a Monte Carlo simulation. It has no cryptographic needs, so it does not matter if it's backdoored by the NSA.
Or should we stop using it altogether?
We can't answer that. That's a confluence of use cases, requirements and personal preferences.
... Also we have at least one software CSPRNG faster than RDRAND, and the most widely used decent PRNG...
Mersenne Twister may be faster for a word at a time after initialization and without Twists, because it's just returning a word from the state array. But I doubt it's as fast as RDRAND for a continuous stream. I know RDRAND can achieve theoretical limits based on bus width in a continuous stream.
According to David Johnston of Intel (who designed the circuit), that's something like 800+ MB/s. See DJ's answer at What is the latency and throughput of the RDRAND instruction on Ivy Bridge?.
So, we have a widespread paranoia that RDRAND was made to embed NSA backdoors into everybody's software cryptography.
Paranoid folks have at least two choices. First, they can forgo using RDRAND and RDSEED. Second, they can use the output of RDRAND and RDSEED to seed another generator, and then use the output of the second generator. I believe the Linux kernel takes the second approach.
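To illustrate the second choice in general terms, here is a toy sketch of seeding a second generator and only ever using that generator's output. This is not how the Linux kernel does it and it is not a vetted DRBG; it is a plain SHA-256-over-counter construction chosen because it fits in a few lines. `HashCounterGenerator` is a hypothetical name, and `hw_bytes` in the usage note stands in for RDRAND/RDSEED output, which Python's standard library cannot read directly.

```python
import hashlib


class HashCounterGenerator:
    """Toy second-stage generator: SHA-256(key || counter). Illustration only."""

    def __init__(self, *seed_materials: bytes) -> None:
        h = hashlib.sha256()
        for material in seed_materials:
            # Length-prefix each input so ("ab", "") and ("a", "b") give different keys.
            h.update(len(material).to_bytes(4, "big"))
            h.update(material)
        self._key = h.digest()
        self._counter = 0

    def read(self, n: int) -> bytes:
        """Return n output bytes, advancing the internal counter."""
        out = bytearray()
        while len(out) < n:
            out += hashlib.sha256(self._key + self._counter.to_bytes(8, "big")).digest()
            self._counter += 1
        return bytes(out[:n])
```

Usage would be something like `gen = HashCounterGenerator(hw_bytes, os.urandom(32))` followed by `gen.read(16)`. Callers never see the raw hardware output, only the hash of it mixed with the other seed material.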