How to find the multiplier that produces a smaller output for every double value?
There've been a couple of experimental approaches; here's a proof that the answer is C = 1 - ε, where ε is machine epsilon (that is, the distance between 1 and the smallest representable number greater than 1).
We know that C < 1, of course, so it makes sense to try C = 1 - ε/2 first, because it's the next representable number smaller than 1. (The ε/2 is because C is in the [0.5, 1) bucket of representable numbers, where the spacing is half what it is in [1, 2).) Let's see if it works for all A.
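To make the definitions concrete, here is a small Python sketch that checks the two claims above on IEEE 754 doubles. The names `eps` and `c_half` are just illustrative, and `math.nextafter` assumes Python 3.9 or later.

```python
import math
import sys

# Machine epsilon for doubles: the gap between 1.0 and the next double above it.
eps = sys.float_info.epsilon

# Candidate multiplier C = 1 - eps/2 (illustrative name, not from the original).
c_half = 1.0 - eps / 2

# The next double above 1.0 is 1 + eps ...
next_above_one = math.nextafter(1.0, 2.0)
# ... and the next double below 1.0 is 1 - eps/2, because spacing halves in [0.5, 1).
next_below_one = math.nextafter(1.0, 0.0)

assert next_above_one == 1.0 + eps
assert next_below_one == c_half
```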
I'm going to assume in this paragraph that 1 <= A < 2. If both A and AC are in the "normal" region then it doesn't really matter what the exponent is; the situation is the same as with exponent 2^0. Now, that choice of C obviously works for A = 1, so we are left with the region 1 < A < 2. Looking at A = 1 + ε, we see that AC (the exact value, not the rounded result) is already greater than 1; and for A = 2 - ε we see that it's less than 2. That's important, because if AC is between 1 and 2, we know that the distance between AC and round(AC) (that is, AC rounded to the nearest representable value) is at most ε/2. Now, if A - AC < ε/2, then round(AC) = A, which we don't want. (If A - AC = ε/2 then it might round to A given the "ties to even" part of the normal FP rounding rules, but let's see if we can do better.) Since we've chosen C = 1 - ε/2, we can see that A - AC = A - A(1 - ε/2) = A * ε/2. Since that's greater than ε/2 (remember, A > 1), it's far enough away from A to round away from it.
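As a sanity check on that paragraph (not a replacement for the proof), the sketch below uses exact rational arithmetic to confirm A - AC = A * ε/2 > ε/2 for some A in (1, 2), and then confirms that the rounded product really is below A. The helper `exact_gap` and the sampling scheme are my own illustration.

```python
import random
import sys
from fractions import Fraction

eps = sys.float_info.epsilon
c_half = 1.0 - eps / 2

def exact_gap(a, c):
    """Exact value of A - A*C, computed with rationals so nothing is rounded."""
    return Fraction(a) - Fraction(a) * Fraction(c)

# The two edge cases from the text plus a fixed-seed random sample of (1, 2).
rng = random.Random(0)
samples = [1.0 + eps, 2.0 - eps] + [rng.uniform(1.0, 2.0) for _ in range(10000)]
samples = [a for a in samples if 1.0 < a < 2.0]

half_eps = Fraction(eps) / 2
for a in samples:
    gap = exact_gap(a, c_half)
    assert gap == Fraction(a) * half_eps   # A - AC = A * eps/2
    assert gap > half_eps                  # strictly more than half a spacing

# The floating-point product therefore rounds to something below A.
all_smaller = all(a * c_half < a for a in samples)
assert all_smaller
```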
BUT! The one other value of A we have to check is the minimum representable normal value, since there AC is not in the normal range and so our "relative distance to nearest" rule doesn't apply. And what we find is that in that case A - AC is exactly half the spacing between adjacent representable values in that region. "Round to nearest, ties to even" kicks in and the product rounds back up to equal A. Drat.
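The failure is easy to reproduce. This sketch assumes IEEE 754 doubles with gradual underflow (no flush-to-zero) and the default round-to-nearest, ties-to-even mode; the variable names are illustrative.

```python
import sys
from fractions import Fraction

eps = sys.float_info.epsilon
min_normal = sys.float_info.min        # smallest positive normal double, 2**-1022
c_half = 1.0 - eps / 2

# Spacing between adjacent doubles just below min_normal (the subnormal spacing).
# min_normal * eps is a product of two powers of two, so it is exact: 2**-1074.
spacing = min_normal * eps

# Exact distance between A and the unrounded product A*C.
gap = Fraction(min_normal) - Fraction(min_normal) * Fraction(c_half)
assert gap == Fraction(spacing) / 2    # an exact tie between A and A - spacing

# Ties-to-even picks A itself, so the multiplier fails here.
product = min_normal * c_half
assert product == min_normal
```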
Going through the same thing with C = 1 - ε, we see that round(AC) < A, and that nothing else even comes close to rounding towards A (we end up asking whether A * ε > ε/2, which of course it is). So the punchline is that C = 1 - ε/2 almost works but the boundary between normals and denormals screws us up, and C = 1 - ε gets us into the end zone.
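For completeness, a quick empirical check of the conclusion. Like the argument above, it only looks at positive normal doubles; `random_normal` is a hypothetical helper that picks a mantissa in [1, 2] and an exponent so the result stays in the normal range. Passing this is evidence, not proof.

```python
import math
import random
import sys

eps = sys.float_info.epsilon
min_normal = sys.float_info.min
c = 1.0 - eps                          # the multiplier the proof arrives at

def random_normal(rng):
    """Hypothetical helper: a positive normal double spread over the exponent range."""
    mantissa = rng.uniform(1.0, 2.0)
    exponent = rng.randint(-1022, 1022)
    return math.ldexp(mantissa, exponent)

rng = random.Random(0)
samples = [min_normal, 1.0, 1.0 + eps, 2.0 - eps, sys.float_info.max]
samples += [random_normal(rng) for _ in range(100000)]

# The case that broke C = 1 - eps/2 now lands strictly below A.
assert min_normal * c < min_normal

all_smaller = all(a * c < a for a in samples)
assert all_smaller
```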