Surprising result using awk floating point arithmetic

You're being bitten by a floating point arithmetic issue.

$ awk 'BEGIN { printf "%.17f\n", 99.15-20.85 }'
78.30000000000001137

http://floating-point-gui.de/ might be able to help clear things up for you - it tries to explain what floating point is, and why arithmetic errors like this happen, and how to avoid these sorts of issues in your programs.


The decimal numbers 99.15, 20.85 and 78.30 don't have exact IEEE 754 binary representations. You can see this with a C program that does the same calculation:

#include <stdio.h>
int
main(int ac, char **av)
{
        float a = 99.15;
        float b = 20.85;
        float c;

        printf("a = %.7f\n", a);
        printf("b = %.7f\n", b);
        c = a - b;
        printf("c = %.7f\n", c);

        return 0;
}

I get these answers on both an x86 and an x86_64 machine, probably because they both do IEEE 754 floating point math:

a = 99.1500015
b = 20.8500004
c = 78.3000031

Here's what happens: floating point numbers get represented with a sign bit (positive or negative), a fixed number of significand (mantissa) bits, and an exponent. Not every rational number (which is what a "floating point" number is in this context) can be represented exactly in IEEE 754 format, so the hardware gets as close as it can. Unfortunately, in your test case, none of the 3 values has an exact representation. That stays true if you use double instead of float, which is what awk normally uses for its numbers.
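To see the same effect at double precision, here is a minimal sketch that prints each value to 17 decimal places and then tests the subtraction against 78.30. It assumes only a POSIX shell and an awk on your PATH; the variable names are just for illustration.

```shell
# Print each value with 17 decimal places to expose the stored approximation,
# then compare the subtraction with the literal 78.30.
out=$(awk 'BEGIN {
    a = 99.15
    b = 20.85
    printf "a     = %.17f\n", a
    printf "b     = %.17f\n", b
    printf "a - b = %.17f\n", a - b
    printf "78.30 = %.17f\n", 78.30
    print ((a - b == 78.30) ? "equal" : "not equal")
}')
printf '%s\n' "$out"
```

The last line should read "not equal": the subtraction lands on a neighbouring double rather than on the double closest to 78.30.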

Here's a further explanation of the spacing of floating point numbers that have exact binary representations.

You can probably find some values that pass your test and others that don't. There are a lot more that don't.

Usually people solve a floating point problem by doing something like this:

if (fabs(c - expected) <= epsilon) {
    // We'll call it equal
} else {
    // Not equal
}
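For what it's worth, the tolerance test above can be sketched in awk along these lines. awk has no built-in absolute-value function, so abs() here is a user-defined helper, and the epsilon value is an assumption (anything comfortably smaller than one cent would do for this data).

```shell
result=$(awk '
# abs() is a user-defined helper; awk has no built-in absolute value.
function abs(x) { return x < 0 ? -x : x }
BEGIN {
    epsilon = 0.000001          # assumed tolerance, well below one cent
    c = 99.15 - 20.85
    if (abs(c - 78.30) <= epsilon)
        print "equal"           # close enough to call it equal
    else
        print "not equal"
}')
printf '%s\n' "$result"
```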

That's clumsier to do in awk, which has no built-in abs(). If you're working with money that has units and two decimal digits of sub-unit (dollars and cents, say), you should just carry out all calculations in the sub-unit (cents in the USA). Do not use floating point to do monetary calculations. You will only find yourself regretting that decision.
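A rough sketch of the sub-unit approach in awk might look like this. to_cents() is a hypothetical helper, not something awk provides, and it assumes non-negative amounts with at most two decimal places.

```shell
result=$(awk '
# to_cents() is a hypothetical helper: it rounds a non-negative decimal
# amount to the nearest whole cent.
function to_cents(amount) { return int(amount * 100 + 0.5) }
BEGIN {
    prior = to_cents(99.15)      # 9915
    spent = to_cents(20.85)      # 2085
    left  = prior - spent        # exact integer arithmetic: 7830
    print ((left == to_cents(78.30)) ? "equal" : "not equal")
    # Convert back to units only when formatting output.
    printf "%d.%02d\n", int(left / 100), left % 100
}')
printf '%s\n' "$result"
```

Whole numbers of this size are represented exactly in a double, so the subtraction and the comparison are both exact.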


You can avoid this kind of mistake by formatting the numbers:

awk -F, '{
    if (NR != 1 && sprintf(CONVFMT,prior_tot-$1) != $2)
        {print "Arithmetic fail..." $0}
    else
        {print "OK"}
    prior_tot = $2}'
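Here is a small test run of that check, as a sketch. The two-line sample input (amount, then running total) is made up for illustration, and check() is just a shell wrapper around the same awk program. One thing to watch: sprintf returns a string, so the comparison with $2 is done as a string comparison, and a total written as 78.30 will not match the formatted 78.3.

```shell
# check() wraps the awk program so it can be run on two sample inputs.
check() {
    awk -F, '{
        if (NR != 1 && sprintf(CONVFMT, prior_tot - $1) != $2)
            print "Arithmetic fail..." $0
        else
            print "OK"
        prior_tot = $2
    }'
}
# Hypothetical data: total written without a trailing zero.
plain=$(printf '%s\n' '99.15,99.15' '20.85,78.3' | check)
# Same data, but the total carries a trailing zero.
padded=$(printf '%s\n' '99.15,99.15' '20.85,78.30' | check)
printf '%s\n' "$plain" "$padded"
```

With the default CONVFMT of %.6g, the first run should print OK twice, while the second reports a failure on the 78.30 line.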
