Difference between float and decimal data type
Not just specific to MySQL, the difference between FLOAT and DECIMAL types is the way they represent fractional values. Floating-point types represent fractions in binary, which can only represent values of the form {m * 2^n | m, n integers}. Values such as 1/5 therefore cannot be represented precisely (without round-off error). Decimal numbers are similarly limited, but represent values of the form {m * 10^n | m, n integers}. Decimals still cannot represent a number like 1/3, but in many common fields, such as finance, the expectation is that certain decimal fractions can always be expressed without loss of fidelity. Since a decimal number can represent a value like $0.20 (one fifth of a dollar) exactly, it is preferred in those situations.
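A quick way to see this from the MySQL prompt (a sketch; the table and column names are made up, and exact behaviour can vary slightly between versions):

mysql> create table fifth (f float, d decimal(10,2));
mysql> insert into fifth values (0.2, 0.2);
mysql> select f = 0.2, d = 0.2 from fifth;

The FLOAT comparison typically comes back as 0, because the column only holds the closest binary approximation of 0.2, while the DECIMAL comparison comes back as 1.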
This is what I found when I had the same question.
mysql> create table numbers (a decimal(10,2), b float);
mysql> insert into numbers values (100, 100);
mysql> select @a := (a/3), @b := (b/3), @a + @a + @a, @b + @b + @b from numbers \G
*************************** 1. row ***************************
@a := (a/3): 33.333333333
@b := (b/3): 33.333333333333
@a + @a + @a: 99.999999999000000000000000000000
@b + @b + @b: 100
The decimal did exactly what it's supposed to do in these cases: it truncated the quotient at its scale, thus losing the repeating 1/3 part.
So for sums the decimal is better, but for divisions the float is better, up to a point, of course. I mean, using DECIMAL will not give you "fail-proof arithmetic" by any means.
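If all you want back is the declared scale, an explicit ROUND recovers it (a sketch continuing the session above, assuming @a still holds the value from the previous query):

mysql> select round(@a + @a + @a, 2);

which brings the sum back to 100.00.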
Hope this helps.
A "float" in most environments is a binary floating-point type. It can accurately store base-2 values (to a certain point), but cannot accurately store many base-10 (decimal) values. Floats are most appropriate for scientific calculations. They're not appropriate for most business-oriented math, and inappropriate use of floats will bite you. Many decimal values can't be exactly represented in base-2. 0.1
can't, for instance, and so you see strange results like 1.0 - 0.1 = 0.8999999
.
Decimals store base-10 numbers. DECIMAL is a good type for most business math (though any built-in "money" type may be more appropriate for financial calculations) where the range of values exceeds that of the integer types and fractional values are needed. Decimals, as the name implies, are designed for base-10 numbers: they can accurately store decimal values (again, up to a certain point).
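You can see both behaviours directly from the MySQL prompt (a sketch; plain literals like 0.1 are handled as exact DECIMAL values, while the e0 suffix forces binary DOUBLE arithmetic):

mysql> select 0.1 + 0.2 = 0.3;
mysql> select 0.1e0 + 0.2e0 = 0.3e0;

The first comparison returns 1; the second returns 0, because 0.1 and 0.2 have no exact binary representation.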
MySQL recently changed the way it stores the DECIMAL type. In the past it stored a character (or nybble) for each digit, i.e. an ASCII (or nybble) representation of the number, rather than a two's complement integer or some derivative thereof.
The current storage format for DECIMAL is a series of 1-, 2-, 3-, or 4-byte integers whose bits are concatenated to create a two's complement number with an implied decimal point. The position of that point is defined by you and stored in the DB schema when you declare the column and specify its DECIMAL size and decimal point position.
By way of example, a 32-bit unsigned int can store any number from 0 to 4,294,967,295. That only reliably covers 999,999,999 (nine full digits), so even if you threw out two bits and used (1<<30)-1 you'd give up nothing. Covering all 9-digit numbers with only 4 bytes is far more efficient than covering only 4 digits with 4 ASCII characters, or 8 digits with 8 nybbles, in the same 32 bits. (A nybble is 4 bits, allowing values 0-15, which is more than is needed for 0-9, but you can't eliminate that waste by going to 3 bits, because 3 bits only cover the values 0-7.)
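A quick sanity check of those limits from the MySQL prompt (a sketch; just arithmetic on constants):

mysql> select pow(2,32) - 1;
mysql> select (1 << 30) - 1;

which come back as 4294967295 and 1073741823, so even the 30-bit value is comfortably above 999,999,999.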
The example used in the MySQL online docs is DECIMAL(18,9). This is 9 digits ahead of and 9 digits behind the implied decimal point, which, as explained above, requires the following storage.
As 18 8-bit chars: 144 bits
As 18 4-bit nybbles: 72 bits
As 2 32-bit integers: 64 bits
Currently DECIMAL supports a maximum of 65 digits, as DECIMAL(M,D), where the largest allowed value for M is 65 and the largest allowed value for D is 30.
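For instance (a sketch; the table names are made up and the exact error text varies by version):

mysql> create table widest (w decimal(65,30));
mysql> create table toowide (t decimal(66,30));

The first is accepted; the second is rejected with a "too-big precision ... maximum is 65" error.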
So as not to require chunks of 9 digits at a time, integers smaller than 32 bits are used to add digits, using 1-, 2- and 3-byte integers. For some reason that defies logic, signed rather than unsigned ints were used, and in so doing 1 bit gets thrown away, resulting in the following storage capabilities (a quick check of these ranges follows the list below). For the 1-, 2- and 4-byte ints the lost bit doesn't matter, but for the 3-byte int it's a disaster, because an entire digit is lost due to the loss of that single bit.
With a 7-bit int: 0 - 99
With a 15-bit int: 0 - 9,999
With a 23-bit int: 0 - 999,999 (0 - 9,999,999 with a 24-bit int)
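Those ranges fall straight out of the bit widths (a sketch; just constant arithmetic at the prompt):

mysql> select (1 << 7) - 1, (1 << 15) - 1, (1 << 23) - 1, (1 << 24) - 1;

which returns 127, 32767, 8388607 and 16777215: enough for 2, 4, 6 and 7 full decimal digits respectively.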
1-, 2-, 3- and 4-byte integers are concatenated together to form a "bit pool" that DECIMAL uses to represent the number precisely as a two's complement integer. The decimal point is NOT stored; it is implied.
This means no ASCII-to-int conversions are required of the DB engine to turn the "number" into something the CPU recognizes as a number. No rounding, no conversion errors; it's a real number the CPU can manipulate.
Calculations on this arbitrarily large integer must be done in software, as there is no hardware support for this kind of number, but these libraries are very old and highly optimized, having been written 50 years ago to support IBM 370 Fortran arbitrary precision floating point data. They're still a lot slower than fixed-sized integer algebra done with CPU integer hardware, or floating point calculations done on the FPU.
In terms of storage efficiency, because the exponent of a float is attached to each and every float, implicitly specifying where its decimal point is, floats are massively redundant and therefore inefficient for DB work. In a DB you already know up front where the decimal point goes, and every row in a table that has a value for a DECIMAL column need only consult the one and only specification of where that decimal point is placed: the M and D arguments to DECIMAL(M,D), stored once in the schema.
The many remarks found here about which format is to be used for various kinds of applications are correct, so I won't belabor the point. I took the time to write this here because whoever is maintaining the linked MySQL online documentation doesn't understand any of the above and after rounds of increasingly frustrating attempts to explain it to them I gave up. A good indication of how poorly they understood what they were writing is the very muddled and almost indecipherable presentation of the subject matter.
As a final thought, if you need high-precision floating-point computation: there have been tremendous advances in floating-point code in the last 20 years, and hardware support for 96-bit and quadruple-precision floats is right around the corner, but there are good arbitrary-precision libraries out there if manipulation of the stored value is important.