What is the true maximum (and minimum) value of Random.nextGaussian()?

Random Implementation

The most important thing you will have to know for this answer is the implementation of Random.nextGaussian:

synchronized public double nextGaussian() {
    // See Knuth, ACP, Section 3.4.1 Algorithm C.
    if (haveNextNextGaussian) {
        haveNextNextGaussian = false;
        return nextNextGaussian;
    } else {
        double v1, v2, s;
        do {
            v1 = 2 * nextDouble() - 1; // between -1 and 1
            v2 = 2 * nextDouble() - 1; // between -1 and 1
            s = v1 * v1 + v2 * v2;
        } while (s >= 1 || s == 0);
        double multiplier = StrictMath.sqrt(-2 * StrictMath.log(s)/s);
        nextNextGaussian = v2 * multiplier;
        haveNextNextGaussian = true;
        return v1 * multiplier;
    }
}

And the implementation of Random.nextDouble:

public double nextDouble() {
    return (double) (((long)(next(26)) << 27) + next(27)) / (1L << 53);
}

First, I want to draw your attention to the fact that nextGaussian generates 2 values at a time, and that depending on whether you know how many nextGaussian calls have passed since the last time the seed was set, you may be able to use a slightly lower max value for odd vs even numbers of calls. From now on, I'm going to call the two maximums v1_max and v2_max, referring to whether the value was generated by v1 * multiplier or v2 * multiplier.

The Answer

With that out of the way let's cut straight to the chase and explain later:

|      |Value             |Seed*          |
|------|------------------|---------------|
|v1_max|7.995084298635286 |97128757896197 |
|v2_max|7.973782613935931 |10818416657590 |
|v1_min|-7.799011049744149|119153396299238|
|v2_min|-7.844680087923773|10300138714312 |
* Seeds for v2 need to have nextGaussian called twice before you see the value listed.

A Closer Look at nextGaussian

The answers by @KaptainWutax and @Marco13 have already gone into detail about the same things, but I think seeing things on a graph makes things clearer. Let's focus on v1_max, the other three values hold very similar logic. I'm going to plot v1 on the x-axis, v2 on the y-axis and v1 * multiplier on the z-axis.

Graph

Our eyes immediately jump to the maximum point at v1 = 0, v2 = 0, v1 * multiplier = infinity. But if you notice in the do-while loop, it explicitly disallows this situation. Therefore, it's clear from the graph that the actual v1_max must have a slightly higher v1 value, but not much higher. Also noteworthy is that for any v1 value > 0, the maximum v1 * multiplier is at v2 = 0.

Our method to find v1_max will be to count up v1 from zero (or, more specifically, counting the nextDouble which generated it up from 0.5, incrementing in steps of 2^-53, as per the implementation of nextDouble). But, just knowing v1, how do we get the other variables, and the v1 * multiplier for that v1?

Reversing nextDouble

It turns out that knowing the output of a nextDouble call is enough to determine the seed of the Random object that generated it at the time. Intuitively, this is because looking at the nextDouble implementation, it "looks like" there should be 2^54 possible outputs - but the seed of Random is only 48-bit. Furthermore, it's possible to recover this seed in much faster time than brute force.

I initially tried a naive approach based on using next(27) directly to get bits of the seed then brute-forcing the remaining 21 bits, but this proved too slow to be useful. Then SicksonFSJoe gave me a much faster method to extract a seed from a single nextDouble call. Note that to understand the details of this method you will have to know the implementation of Random.next, and a little modular arithmetic.

private static long getSeed(double val) {
    long lval = (long) (val * (1L << 53));
    // let t = first seed (generating the high bits of this double)
    // let u = second seed (generating the low bits of this double)
    long a = lval >> 27; // a is the high 26 bits of t
    long b = lval & ((1 << 27) - 1); // b is the high 27 bits of u

    // ((a << 22) + c) * 0x5deece66d + 0xb = (b << 21) + d (mod 2**48)
    // after rearranging this gives
    // (b << 21) - 11 - (a << 22) * 0x5deece66d = c * 0x5deece66d - d (mod 2**48)
    // and because modular arithmetic
    // (b << 21) - 11 - (a << 22) * 0x5deece66d + (k << 48) = c * 0x5deece66d - d
    long lhs = ((b << 21) - 0xb - (a << 22) * 0x5deece66dL) & 0xffffffffffffL;

    // c * 0x5deece66d is 56 bits max, which gives a max k of 375
    // also check k = 65535 because the rhs can be negative
    for (long k = 65535; k != 376; k = k == 65535 ? 0 : k + 1) {
        // calculate the value of d
        long rem = (0x5deece66dL - (lhs + (k << 48))) % 0x5deece66dL;
        long d = (rem + 0x5deece66dL) % 0x5deece66dL; // force positive
        if (d < (1 << 21)) {
            // rearrange the formula to get c
            long c = lhs + d;
            c *= 0xdfe05bcb1365L; // = 0x5deece66d**-1 (mod 2**48)
            c &= 0xffffffffffffL;
            if (c < (1 << 22)) {
                long seed = (a << 22) + c;
                seed = ((seed - 0xb) * 0xdfe05bcb1365L) & 0xffffffffffffL; // run the LCG forwards one step
                return seed;
            }
        }
    }

    return Long.MAX_VALUE; // no seed
}

Now we can get the seed from a nextDouble, it makes sense that we can iterate over v1 values rather than seeds.

Bringing it All Together

An outline of the algorithm is as follows:

  1. Initialize nd1 (stands for nextDouble 1) to 0.5
  2. While the upper bound and our current v1_max haven't crossed, repeat steps 3-7
  3. Increment nd1 by 2^-53
  4. Compute seed from nd1 (if it exists), and generate nd2, v1, v2 and s
  5. Check the validity of s
  6. Generate a gaussian, compare with v1_max
  7. Set a new upper bound by assuming v2 = 0

And here is a Java implementation. You can verify the values I gave above for yourself if you want.

public static void main(String[] args) {
    double upperBound;
    double nd1 = 0.5, nd2;
    double maxGaussian = Double.MIN_VALUE;
    long maxSeed = 0;
    Random rand = new Random();
    long seed;
    int i = 0;
    do {
        nd1 += 0x1.0p-53;
        seed = getSeed(nd1);

        double v1, v2, s;
        v1 = 2 * nd1 - 1;

        if (seed != Long.MAX_VALUE) { // not no seed
            rand.setSeed(seed ^ 0x5deece66dL);
            rand.nextDouble(); // nd1
            nd2 = rand.nextDouble();

            v2 = 2 * nd2 - 1;
            s = v1 * v1 + v2 * v2;
            if (s < 1 && s != 0) { // if not, another seed will catch it
                double gaussian = v1 * StrictMath.sqrt(-2 * StrictMath.log(s) / s);
                if (gaussian > maxGaussian) {
                    maxGaussian = gaussian;
                    maxSeed = seed;
                }
            }
        }

        upperBound = v1 * StrictMath.sqrt(-2 * StrictMath.log(v1 * v1) / (v1 * v1));
        if (i++ % 100000 == 0)
            System.out.println(maxGaussian + " " + upperBound);
    } while (upperBound > maxGaussian);
    System.out.println(maxGaussian + " " + maxSeed);
}

One final catch to watch out for, this algorithm will get you the internal seeds for the Random. To use it in setSeed, you have to xor them with the Random's multiplier, 0x5deece66dL (which has already been done for you in the table above).


So everything I will say here is purely theoretical, and I am still working on a GPU program to scan the whole seed base.

The nextGaussian() method is implemented as such.

private double nextNextGaussian;
private boolean haveNextNextGaussian = false;

 public double nextGaussian() {

   if (haveNextNextGaussian) {

     haveNextNextGaussian = false;
     return nextNextGaussian;

   } else {

     double v1, v2, s;

     do {
       v1 = 2 * nextDouble() - 1;   // between -1.0 and 1.0
       v2 = 2 * nextDouble() - 1;   // between -1.0 and 1.0
       s = v1 * v1 + v2 * v2;
     } while (s >= 1 || s == 0);

     double multiplier = StrictMath.sqrt(-2 * StrictMath.log(s)/s);
     nextNextGaussian = v2 * multiplier;
     haveNextNextGaussian = true;
     return v1 * multiplier;

   }

 }

The most interesting part has to be at the end, [return v1 * multiplier]. Because v1 cannot be larger than 1.0D, we need to find a way to increase the size of the multiplier, which is implemented as follows.

double multiplier = StrictMath.sqrt(-2 * StrictMath.log(s)/s);

The only variable being "s", it is safe to establish that the lower "s" is, the bigger the multiplier will get. All good? Let's keep going.

 do {
   v1 = 2 * nextDouble() - 1;   // between -1.0 and 1.0
   v2 = 2 * nextDouble() - 1;   // between -1.0 and 1.0
   s = v1 * v1 + v2 * v2;
 } while (s >= 1 || s == 0);

This tells us that "s" has to belong to the ]0,1[ number set and that the lowest value we're looking for is just a little bigger than zero. "S" is declared with the sum of squares of "v1" and "v2". To get the smallest theoretical value, v2 needs to be zero and v1 needs to be as small as it can possibly get. Why "theoretical"? Because they are generated from nextDouble() calls. There is no guarantee that the seed base contains those 2 consecutive numbers.

Let's have fun now!

Lowest value "v1" can hold is the double's epsilon, which is 2^(-1022). Making our way back, to get such a number, nextDouble would need to generate (2^(-1022) + 1) / 2.

That is...very very very disturbing. I'm not an expert, but I am pretty sure many bits are gonna get lost, and floating-point errors are to be expected.

It is probably(most definitely) impossible for a nextDouble to generate such a value, but the goal is to find a value as close as possible to that number.

Just for the fun of it, let's do the full math to find the answer. StrictMath.log() is implemented as the natural log. I haven't looked into it's precision, but let's assume there was no limitations on that level. The highest nextGaussian would be calculated as...

= (-2 * ln(v1 * v1) / (v1 * v1)) * v1 
= (-2 * ln(EPSILON^2) / (EPSILON^2)) * EPSILON

where EPSILON is equal to 2^(-1022).

Believe it or not, I could barely find any calculator that accepted such small numbers, but I finally opted for this high precision calculator.

By plugging in this equation,

(-2 * ln((2^(-1022))^2) / ((2^(-1022))^2)) * (2^(-1022))

I got,

1.273479378356503041913108844696651886724617446559145569961266215283953862086306158E+311

Pretty big huh? Well...it's definitely not going to be that big...but it's nice to take into account. Hope my reasoning makes sense and don't be shy to point out any mistake I made.

As I said at the start, I am working on a program to bruteforce all seeds and find the actual lowest value. I'll keep you updated.

EDIT :

Sorry for the late response. After bruteforcing 2^48 seeds in about 10 hours, I found the EXACT same answers as Earthcomputer.

Tags:

Java

Random