Hacks for clamping integer to 0-255 and doubles to 0.0-1.0?

This is a trick I use for clamping an int to a 0 to 255 range:

/**
 * Clamps the input to a 0 to 255 range.
 * @param v any int value
 * @return {@code v < 0 ? 0 : v > 255 ? 255 : v}
 */
public static int clampTo8Bit(int v) {
    // if out of range
    if ((v & ~0xFF) != 0) {
        // invert sign bit, shift to fill, then mask (generates 0 or 255)
        v = ((~v) >> 31) & 0xFF;
    }
    return v;
}

That still has one branch, but a handy thing about it is that you can test whether any of several ints are out of range in one go by ORing them together, which makes things faster in the common case that all of them are in range. For example:

/** Packs four 8-bit values into a 32-bit value, with clamping. */
public static int ARGBclamped(int a, int r, int g, int b) {
    if (((a | r | g | b) & ~0xFF) != 0) {
        a = clampTo8Bit(a);
        r = clampTo8Bit(r);
        g = clampTo8Bit(g);
        b = clampTo8Bit(b);
    }
    return (a << 24) + (r << 16) + (g << 8) + (b << 0);
}

Note that your compiler may already give you what you want if you code value = min (value, 255). This may be translated into a MIN instruction if it exists, or into a comparison followed by conditional move, such as the CMOVcc instruction on x86.

The following code assumes two's complement representation of integers, which is usually a given today. The conversion from Boolean to integer should not involve branching under the hood, as modern architectures either provide instructions that can directly be used to form the mask (e.g. SETcc on x86 and ISETcc on NVIDIA GPUs), or can apply predication or conditional moves. If all of those are lacking, the compiler may emit a branchless instruction sequence based on arithmetic right shift to construct a mask, along the lines of Boann's answer. However, there is some residual risk that the compiler could do the wrong thing, so when in doubt, it would be best to disassemble the generated binary to check.

int value, mask;

mask = 0 - (value > 255);  // mask = all 1s if value > 255, all 0s otherwise
value = (255 & mask) | (value & ~mask);

On many architectures, use of the ternary operator ?: can also result in a branchless instruction sequences. The hardware may support select-type instructions which are essentially the hardware equivalent of the ternary operator, such as ICMP on NVIDIA GPUs. Or it provides CMOV (conditional move) as in x86, or predication as on ARM, both of which can be used to implement branch-less code for ternary operators. As in the previous case, one would want to examine the disassembled binary code to be absolutely sure the resulting code is without branches.

int value;

value = (value > 255) ? 255 : value;

In case of floating-point operands, modern floating-point units typically provide FMIN and FMAX instructions which map straight to the C/C++ standard math functions fmin() and fmax(). Alternatively fmin() and fmax() may be translated into a comparison followed by a conditional move. Again, it would be prudent to examine the generated code to make sure it is branchless.

double value;

value = fmax (fmin (value, 1.0), 0.0);

Hacks for clamping integer to 0-255 and doubles to 0.0-1.0?

Tags:

Floating Point

Language Agnostic

Math

Bit Manipulation

Integer Arithmetic

Related

Recent Posts