Very fast 3D distance check?

You can leave out the square root because for all positive (or really, non-negative) numbers x and y, if sqrt(x) < sqrt(y) then x < y. Since you're summing squares of real numbers, the square of every real number is non-negative, and the sum of any positive numbers is positive, the square root condition holds.

You cannot eliminate the multiplication, however, without changing the algorithm. Here's a counterexample: if x is (3, 1, 1) and y is (4, 0, 0), |x| < |y| because sqrt(1*1+1*1+3*3) < sqrt(4*4+0*0+0*0) and 1*1+1*1+3*3 < 4*4+0*0+0*0, but 1+1+3 > 4+0+0.

Since modern CPUs can compute a dot product faster than they can actually load the operands from memory, it's unlikely that you would have anything to gain by eliminating the multiply anyway (I think the newest CPUs have a special instruction that can compute a dot product every 3 cycles!).

I would not consider changing the algorithm without doing some profiling first. Your choice of algorithm will heavily depend on the size of your dataset (does it fit in cache?), how often you have to run it, and what you do with the results (collision detection? proximity? occlusion?).


I'm disappointed that the great old mathematical tricks seem to be getting lost. Here is the answer you're asking for. Source is Paul Hsieh's excellent web site: http://www.azillionmonkeys.com/qed/sqroot.html . Note that you don't care about distance; you will do fine for your sort with square of distance, which will be much faster.


In 2D, we can get a crude approximation of the distance metric without a square root with the formula:

distanceapprox (x, y) = (1 + 1/(4-2*√2))/2 * min((1 / √2)*(|x|+|y|), max (|x|, |y|))

which will deviate from the true answer by at most about 8%. A similar derivation for 3 dimensions leads to:

distanceapprox (x, y, z) = (1 + 1/4√3)/2 * min((1 / √3)*(|x|+|y|+|z|), max (|x|, |y|, |z|))

with a maximum error of about 16%.

However, something that should be pointed out, is that often the distance is only required for comparison purposes. For example, in the classical mandelbrot set (z←z2+c) calculation, the magnitude of a complex number is typically compared to a boundary radius length of 2. In these cases, one can simply drop the square root, by essentially squaring both sides of the comparison (since distances are always non-negative). That is to say:

    √(Δx2+Δy2) < d is equivalent to Δx2+Δy2 < d2, if d ≥ 0

I should also mention that Chapter 13.2 of Richard G. Lyons's "Understanding Digital Signal Processing" has an incredible collection of 2D distance algorithms (a.k.a complex number magnitude approximations). As one example:

Max = x > y ? x : y;

Min = x < y ? x : y;

if ( Min < 0.04142135Max )

|V| = 0.99 * Max + 0.197 * Min;

else

|V| = 0.84 * Max + 0.561 * Min;

which has a maximum error of 1.0% from the actual distance. The penalty of course is that you're doing a couple branches; but even the "most accepted" answer to this question has at least three branches in it.

If you're serious about doing a super fast distance estimate to a specific precision, you could do so by writing your own simplified fsqrt() estimate using the same basic method as the compiler vendors do, but at a lower precision, by doing a fixed number of iterations. For example, you can eliminate the special case handling for extremely small or large numbers, and/or also reduce the number of Newton-Rapheson iterations. This was the key strategy underlying the so-called "Quake 3" fast inverse square root implementation -- it's the classic Newton algorithm with exactly one iteration.

Do not assume that your fsqrt() implementation is slow without benchmarking it and/or reading the sources. Most modern fsqrt() library implementations are branchless and really damned fast. Here for example is an old IBM floating point fsqrt implementation. Premature optimization is, and always will be, the root of all evil.


What I usually do is first filter by Manhattan distance

float CBox::Within3DManhattanDistance( Vec3 c1, Vec3 c2, float distance )
{
    float dx = abs(c2.x - c1.x);
    float dy = abs(c2.y - c1.y);
    float dz = abs(c2.z - c1.z);

    if (dx > distance) return 0; // too far in x direction
    if (dy > distance) return 0; // too far in y direction
    if (dz > distance) return 0; // too far in z direction

    return 1; // we're within the cube
}

Actually you can optimize this further if you know more about your environment. For example, in an environment where there is a ground like a flight simulator or a first person shooter game, the horizontal axis is very much larger than the vertical axis. In such an environment, if two objects are far apart they are very likely separated more by the x and y axis rather than the z axis (in a first person shooter most objects share the same z axis). So if you first compare x and y you can return early from the function and avoid doing extra calculations:

float CBox::Within3DManhattanDistance( Vec3 c1, Vec3 c2, float distance )
{
    float dx = abs(c2.x - c1.x);
    if (dx > distance) return 0; // too far in x direction

    float dy = abs(c2.y - c1.y);
    if (dy > distance) return 0; // too far in y direction

    // since x and y distance are likely to be larger than
    // z distance most of the time we don't need to execute
    // the code below:

    float dz = abs(c2.z - c1.z);
    if (dz > distance) return 0; // too far in z direction

    return 1; // we're within the cube
}

Sorry, I didn't realize the function is used for sorting. You can still use Manhattan distance to get a very rough first sort:

float CBox::ManhattanDistance( Vec3 c1, Vec3 c2 )
{
    float dx = abs(c2.x - c1.x);
    float dy = abs(c2.y - c1.y);
    float dz = abs(c2.z - c1.z);

    return dx+dy+dz;
}

After the rough first sort you can then take the topmost results, say the top 10 closest players, and re-sort using proper distance calculations.


Here's an equation that might help you get rid of both sqrt and multiply:

max(|dx|, |dy|, |dz|) <= distance(dx,dy,dz) <= |dx| + |dy| + |dz|

This gets you a range estimate for the distance which pins it down to within a factor of 3 (the upper and lower bounds can differ by at most 3x). You can then sort on, say, the lower number. You then need to process the array until you reach an object which is 3x farther away than the first obscuring object. You are then guaranteed to not find any object that is closer later in the array.

By the way, sorting is overkill here. A more efficient way would be to make a series of buckets with different distance estimates, say [1-3], [3-9], [9-27], .... Then put each element in a bucket. Process the buckets from smallest to largest until you reach an obscuring object. Process 1 additional bucket just to be sure.

By the way, floating point multiply is pretty fast nowadays. I'm not sure you gain much by converting it to absolute value.

Tags:

Algorithm

C++