Can random number generators be used in compression?

"In the case it just happened to be the same" is the key. It will (almost) never happen. Say you have million bit strings. There are $2^{1000000}\approx 10^{301030}$ of them. If the string happened to be $\pi$, you would have a phenomenal amount of compression. That happens very rarely. If your random number generator is good, it will happen just as rarely. The compression we use reflects the non-randomness in the files we compress. There are tremendous correlations in text (because it is generated in natural language, not random) or photos (big areas are close in color). Compression takes advantage of that.


The randomness isn't essential to your argument; what you're really describing is a function whose small inputs produce large outputs. So one idea, instead of using random numbers, is to take a huge library of videos, images, code, etc., analyze it to find frequently recurring patterns, and make a list of those patterns. An index into that list plays the same role as the seed for a random number generator.
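Here is a minimal sketch of that idea, assuming both sides share the same pattern list (the helper names and the toy corpus are made up for illustration): the encoder replaces a frequent pattern with its index in the list, and the decoder looks the pattern back up.

```python
from collections import Counter

def build_pattern_list(corpus: str, k: int = 4, size: int = 16) -> list[str]:
    """Collect the `size` most frequent length-k substrings of the corpus."""
    counts = Counter(corpus[i:i + k] for i in range(len(corpus) - k + 1))
    return [pattern for pattern, _ in counts.most_common(size)]

def encode(text: str, patterns: list[str]) -> list:
    """Greedy left-to-right pass: emit an index where a pattern matches,
    otherwise emit the literal character."""
    out, i = [], 0
    while i < len(text):
        for idx, p in enumerate(patterns):
            if text.startswith(p, i):
                out.append(idx)          # a small index replaces a longer pattern
                i += len(p)
                break
        else:
            out.append(text[i])          # literal fallback
            i += 1
    return out

def decode(tokens: list, patterns: list[str]) -> str:
    return "".join(patterns[t] if isinstance(t, int) else t for t in tokens)

corpus = "the cat sat on the mat; the cat ate the rat"
patterns = build_pattern_list(corpus)
tokens = encode(corpus, patterns)
assert decode(tokens, patterns) == corpus
```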

This has been done before; I went to a talk recently where they said the most common features in photographs are sharp lines, so they encoded every instance of a line-like object (viewed as a subset of a five-by-five grid) in a list and then decomposed photographs using these masks. It was really interesting and fun! I think Morse theory has been used similarly, but I'm not sure.
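I don't have the details of the talk, but a loose, hypothetical sketch of what such a scheme might look like: enumerate a handful of line-like masks on a five-by-five grid and describe a binary patch by the index of the mask it most resembles.

```python
import numpy as np

def line_masks() -> list[np.ndarray]:
    """Horizontal, vertical, and the two diagonal lines on a 5x5 grid."""
    masks = []
    for r in range(5):                      # horizontal lines
        m = np.zeros((5, 5), dtype=bool)
        m[r, :] = True
        masks.append(m)
    for c in range(5):                      # vertical lines
        m = np.zeros((5, 5), dtype=bool)
        m[:, c] = True
        masks.append(m)
    masks.append(np.eye(5, dtype=bool))                 # main diagonal
    masks.append(np.fliplr(np.eye(5, dtype=bool)))      # anti-diagonal
    return masks

def best_mask_index(patch: np.ndarray, masks: list[np.ndarray]) -> int:
    """Index of the mask agreeing with the binary patch on the most pixels."""
    return int(np.argmax([np.sum(patch == m) for m in masks]))

patch = np.zeros((5, 5), dtype=bool)
patch[2, :] = True                           # a horizontal edge through the middle
print(best_mask_index(patch, line_masks()))  # -> 2, the middle horizontal mask
```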