How do I calculate percentiles with Python/NumPy?
By the way, there is a pure-Python implementation of a percentile function, in case you don't want to depend on SciPy. The function is copied below:
## {{{ http://code.activestate.com/recipes/511478/ (r1)
import math
import functools

def percentile(N, percent, key=lambda x:x):
    """
    Find the percentile of a list of values.

    @parameter N - is a list of values. Note N MUST BE already sorted.
    @parameter percent - a float value from 0.0 to 1.0.
    @parameter key - optional key function to compute value from each element of N.

    @return - the percentile of the values
    """
    if not N:
        return None
    k = (len(N)-1) * percent
    f = math.floor(k)
    c = math.ceil(k)
    if f == c:
        return key(N[int(k)])
    d0 = key(N[int(f)]) * (c-k)
    d1 = key(N[int(c)]) * (k-f)
    return d0+d1

# median is 50th percentile.
median = functools.partial(percentile, percent=0.5)
## end of http://code.activestate.com/recipes/511478/ }}}
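One thing that is easy to miss with the recipe above is that the input must already be sorted and that percent is a fraction (0.0 to 1.0), not a number from 0 to 100. Here is a minimal sketch of the same linear interpolation that sorts a copy of the data first. The name `percentile_unsorted` and the sample list are made up for illustration and are not part of the recipe.

```python
import math

def percentile_unsorted(values, percent):
    """Linear-interpolation percentile; percent is a fraction in [0.0, 1.0].

    Hypothetical helper: same arithmetic as the recipe, but it sorts a copy
    of the input so the caller does not have to.
    """
    if not values:
        return None
    data = sorted(values)            # the recipe requires sorted input
    k = (len(data) - 1) * percent    # fractional index into the sorted data
    f = math.floor(k)
    c = math.ceil(k)
    if f == c:
        return data[int(k)]          # k falls exactly on an element
    # otherwise weight the two neighbours by their distance from k
    return data[f] * (c - k) + data[c] * (k - f)

scores = [4, 1, 3, 2]                # deliberately unsorted sample data
q1 = percentile_unsorted(scores, 0.25)
med = percentile_unsorted(scores, 0.5)
q3 = percentile_unsorted(scores, 0.75)
print(q1, med, q3)                   # 1.75 2.5 3.25
```

With four values, none of the quartiles lands exactly on an element, so all three results are interpolated between neighbours.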
You might be interested in the SciPy Stats package (scipy.stats). It has the percentile function you're after, scoreatpercentile(), and many other statistical goodies.
percentile() is available in numpy too.
import numpy as np
a = np.array([1,2,3,4,5])
p = np.percentile(a, 50) # return 50th percentile, i.e. the median.
print(p)
3.0
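Note that np.percentile() takes the percentile on a 0 to 100 scale, unlike the pure-Python recipe, which takes a fraction from 0.0 to 1.0. It also accepts a sequence of percentiles, so several can be computed in one call. A small sketch, reusing the same array:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# Pass a list of percentiles (0-100 scale) to get them all at once.
quartiles = np.percentile(a, [25, 50, 75])
print(quartiles)  # [2. 3. 4.]
```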
This ticket leads me to believe they won't be integrating percentile() into numpy anytime soon.