Convert float to string in positional format (without scientific notation and false precision)
Unfortunately it seems that not even the new-style formatting with float.__format__
supports this. The default formatting of float
s is the same as with repr
; and with f
flag there are 6 fractional digits by default:
>>> format(0.0000000005, 'f')
'0.000000'
However there is a hack to get the desired result - not the fastest one, but relatively simple:
- first the float is converted to a string using
str()
orrepr()
- then a new
Decimal
instance is created from that string. Decimal.__format__
supportsf
flag which gives the desired result, and, unlikefloat
s it prints the actual precision instead of default precision.
Thus we can make a simple utility function float_to_str
:
import decimal
# create a new context for this task
ctx = decimal.Context()
# 20 digits should be enough for everyone :D
ctx.prec = 20
def float_to_str(f):
"""
Convert the given float to a string,
without resorting to scientific notation
"""
d1 = ctx.create_decimal(repr(f))
return format(d1, 'f')
Care must be taken to not use the global decimal context, so a new context is constructed for this function. This is the fastest way; another way would be to use decimal.local_context
but it would be slower, creating a new thread-local context and a context manager for each conversion.
This function now returns the string with all possible digits from mantissa, rounded to the shortest equivalent representation:
>>> float_to_str(0.1)
'0.1'
>>> float_to_str(0.00000005)
'0.00000005'
>>> float_to_str(420000000000000000.0)
'420000000000000000'
>>> float_to_str(0.000000000123123123123123123123)
'0.00000000012312312312312313'
The last result is rounded at the last digit
As @Karin noted, float_to_str(420000000000000000.0)
does not strictly match the format expected; it returns 420000000000000000
without trailing .0
.
If you are satisfied with the precision in scientific notation, then could we just take a simple string manipulation approach? Maybe it's not terribly clever, but it seems to work (passes all of the use cases you've presented), and I think it's fairly understandable:
def float_to_str(f):
float_string = repr(f)
if 'e' in float_string: # detect scientific notation
digits, exp = float_string.split('e')
digits = digits.replace('.', '').replace('-', '')
exp = int(exp)
zero_padding = '0' * (abs(int(exp)) - 1) # minus 1 for decimal point in the sci notation
sign = '-' if f < 0 else ''
if exp > 0:
float_string = '{}{}{}.0'.format(sign, digits, zero_padding)
else:
float_string = '{}0.{}{}'.format(sign, zero_padding, digits)
return float_string
n = 0.000000054321654321
assert(float_to_str(n) == '0.000000054321654321')
n = 0.00000005
assert(float_to_str(n) == '0.00000005')
n = 420000000000000000.0
assert(float_to_str(n) == '420000000000000000.0')
n = 4.5678e-5
assert(float_to_str(n) == '0.000045678')
n = 1.1
assert(float_to_str(n) == '1.1')
n = -4.5678e-5
assert(float_to_str(n) == '-0.000045678')
Performance:
I was worried this approach may be too slow, so I ran timeit
and compared with the OP's solution of decimal contexts. It appears the string manipulation is actually quite a bit faster. Edit: It appears to only be much faster in Python 2. In Python 3, the results were similar, but with the decimal approach slightly faster.
Result:
Python 2: using
ctx.create_decimal()
:2.43655490875
Python 2: using string manipulation:
0.305557966232
Python 3: using
ctx.create_decimal()
:0.19519368198234588
Python 3: using string manipulation:
0.2661344590014778
Here is the timing code:
from timeit import timeit
CODE_TO_TIME = '''
float_to_str(0.000000054321654321)
float_to_str(0.00000005)
float_to_str(420000000000000000.0)
float_to_str(4.5678e-5)
float_to_str(1.1)
float_to_str(-0.000045678)
'''
SETUP_1 = '''
import decimal
# create a new context for this task
ctx = decimal.Context()
# 20 digits should be enough for everyone :D
ctx.prec = 20
def float_to_str(f):
"""
Convert the given float to a string,
without resorting to scientific notation
"""
d1 = ctx.create_decimal(repr(f))
return format(d1, 'f')
'''
SETUP_2 = '''
def float_to_str(f):
float_string = repr(f)
if 'e' in float_string: # detect scientific notation
digits, exp = float_string.split('e')
digits = digits.replace('.', '').replace('-', '')
exp = int(exp)
zero_padding = '0' * (abs(int(exp)) - 1) # minus 1 for decimal point in the sci notation
sign = '-' if f < 0 else ''
if exp > 0:
float_string = '{}{}{}.0'.format(sign, digits, zero_padding)
else:
float_string = '{}0.{}{}'.format(sign, zero_padding, digits)
return float_string
'''
print(timeit(CODE_TO_TIME, setup=SETUP_1, number=10000))
print(timeit(CODE_TO_TIME, setup=SETUP_2, number=10000))
As of NumPy 1.14.0, you can just use numpy.format_float_positional
. For example, running against the inputs from your question:
>>> numpy.format_float_positional(0.000000054321654321)
'0.000000054321654321'
>>> numpy.format_float_positional(0.00000005)
'0.00000005'
>>> numpy.format_float_positional(0.1)
'0.1'
>>> numpy.format_float_positional(4.5678e-20)
'0.000000000000000000045678'
numpy.format_float_positional
uses the Dragon4 algorithm to produce the shortest decimal representation in positional format that round-trips back to the original float input. There's also numpy.format_float_scientific
for scientific notation, and both functions offer optional arguments to customize things like rounding and trimming of zeros.