logit and inverse logit functions for extreme values
Either use
1. The bigfloat package, which supports arbitrary-precision floating-point operations.
2. The SymPy symbolic math package. I'll give examples of both:
First, bigfloat:
http://packages.python.org/bigfloat/
Here's a simple example:
from bigfloat import *

def logit(p):
    with precision(100000):
        return log(p) - log(1 - BigFloat(p))

def inv_logit(p):
    with precision(100000):
        return exp(p) / (1 + exp(p))

int(round(logit(inv_logit(12422.0))))
# gives 12422
int(round(logit(inv_logit(-12422.0))))
# gives -12422
This is really slow. You may want to consider restructuring your problem and doing some parts analytically. Cases like these are rare in real problems - I'm curious about what kind of problem you are working on.
Example installation:
wget http://pypi.python.org/packages/source/b/bigfloat/bigfloat-0.3.0a2.tar.gz
tar xvzf bigfloat-0.3.0a2.tar.gz
cd bigfloat-0.3.0a2
as root:
python setup.py install
About the reason your functions worked better with negative values, consider:
>>> float(inv_logit(-15))
3.059022269256247e-07
>>> float(inv_logit(15))
0.9999996940977731
In the first case, floating point numbers can represent this value easily: the exponent shifts the decimal point, so the leading zeroes (0.0000...) do not need to be stored. In the second case, all of the leading 0.999... digits need to be stored, so you need all that extra precision to get an exact result when later computing 1 - p in logit().
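To see this concretely, here is a small sketch comparing how densely doubles are packed near those two values (np.spacing gives the distance to the next representable float; the figures in the comments are approximate):
import numpy as np

np.spacing(3.059022269256247e-07)   # ~5e-23: doubles near 0 are very finely spaced
np.spacing(0.9999996940977731)      # ~1e-16: near 1 the grid is much coarser
# So computing 1 - p in logit() loses everything below ~1e-16 once p is close to 1.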
Here's the symbolic math way (significantly faster!):
from sympy import *

def inv_logit(p):
    return exp(p) / (1 + exp(p))

def logit(p):
    return log(p) - log(1 - p)

x = Symbol('x')
expr = logit(inv_logit(x))
# expr is now:
# -log(1 - exp(x)/(1 + exp(x))) + log(exp(x)/(1 + exp(x)))
# rewrite it (there are many other ways to do this; read the docs)
# you may want to make an expansion (of some suitable kind) instead
expr = cancel(powsimp(expr)).expand()
# it is now 'x'
# just evaluate any expression like this:
result = expr.subs(x, 123.231)
# result is now a SymPy expression containing 123.231
# to get the float:
result.evalf()
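If you need to evaluate the simplified expression at many points, one option (a sketch using SymPy's lambdify, which compiles an expression into an ordinary Python function) is:
from sympy import lambdify

f = lambdify(x, expr)   # turn the simplified expression into a plain numeric function
f(123.231)              # 123.231, evaluated without symbolic overhead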
SymPy is documented at http://docs.sympy.org/. On Ubuntu it can be installed via Synaptic.
You're running up against the precision limits of an IEEE 754 double-precision float. You'll need to use higher-precision numbers and operations if you want a larger range and a more precise domain.
>>> import decimal
>>> import numpy as np
>>> 1 + np.exp(-37)
1.0
>>> 1 + decimal.Decimal(-37).exp()
Decimal('1.000000000000000085330476257')
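As a rough sketch of what that can look like for both functions (the helper names logit_dec and inv_logit_dec and the 50-digit precision are my own choices, not part of any library):
from decimal import Decimal, getcontext

getcontext().prec = 50   # 50 significant digits instead of the default 28

def logit_dec(p):
    # log(p / (1 - p)) in Decimal arithmetic
    p = Decimal(p)
    return (p / (1 - p)).ln()

def inv_logit_dec(x):
    # 1 / (1 + exp(-x)) in Decimal arithmetic
    x = Decimal(x)
    return 1 / (1 + (-x).exp())

inv_logit_dec(37)             # about 0.9999999999999999146695..., not rounded up to 1
logit_dec(inv_logit_dec(37))  # recovers approximately 37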
There is a way to implement the functions so that they are stable over a wide range of values, but it involves distinguishing cases depending on the argument.
Take, for example, the inv_logit function. Your formula "np.exp(p) / (1 + np.exp(p))" is correct but will overflow for big p. If you divide the numerator and denominator by np.exp(p) you obtain the equivalent expression
1. / (1. + np.exp(-p))
which will not overflow for big positive p. It will overflow, however, for big negative values of p. Thus, a stable implementation could be as follows:
import numpy as np

def inv_logit(p):
    if p > 0:
        return 1. / (1. + np.exp(-p))
    elif p <= 0:
        return np.exp(p) / (1 + np.exp(p))
    else:
        raise ValueError
This is the strategy used in the library LIBLINEAR (and possibly others).
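For array inputs, one way to apply the same case split without evaluating the unstable branch (this vectorized variant is my own sketch, not code from LIBLINEAR) is:
import numpy as np

def inv_logit_vec(p):
    # Evaluate each branch only on the elements it is stable for,
    # so neither call to np.exp() overflows.
    p = np.asarray(p, dtype=float)
    out = np.empty_like(p)
    pos = p > 0
    out[pos] = 1. / (1. + np.exp(-p[pos]))
    ep = np.exp(p[~pos])
    out[~pos] = ep / (1. + ep)
    return out

inv_logit_vec([-1000., -1.5, 0., 1.5, 1000.])
# roughly array([0., 0.18242552, 0.5, 0.81757448, 1.])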
Nowadays, SciPy has logit and expit (inverse logit) functions, e.g.
>>> from scipy.special import logit, expit
>>> import numpy as np
>>> logit([0, 0.25, 0.5, 0.75, 1])
array([ -inf, -1.09861229, 0. , 1.09861229, inf])
>>> expit([-np.inf, -1.5, 0, 1.5, np.inf])
array([ 0. , 0.18242552, 0.5 , 0.81757448, 1. ])
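These are plain double-precision routines, so they round-trip fine for moderate arguments but still hit the precision limits described above for extreme ones (a quick illustrative sketch; outputs in the comments are approximate):
from scipy.special import logit, expit
import numpy as np

np.allclose(logit(expit(12.0)), 12.0)   # True: well within double precision
expit(800.0)                            # 1.0 exactly, since exp(-800) underflows
logit(expit(800.0))                     # inf, not 800 -- the same limit as above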