Pandas rolling window percentile rank

If you only need the rank of the last observation in each window, as is the case with rolling apply, you can use:

def pctrank(x):
    # argsort gives the sorted order; argmax finds where the last element lands in it
    i = x.argsort().argmax() + 1
    n = len(x)
    return i / n

It runs about twice as fast.
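As a rough usage sketch, the function above can be passed to rolling apply. The sample values and the window size of 3 are made up for illustration, and raw=True is assumed so that each window arrives as a NumPy array:

```python
import numpy as np
import pandas as pd

def pctrank(x):
    # 1-based rank of the last element in the window, divided by the window length
    i = x.argsort().argmax() + 1
    n = len(x)
    return i / n

# Illustrative data only; all values are distinct, so there are no ties
s = pd.Series([3.0, 1.0, 4.0, 1.5, 5.0, 2.0])

# raw=True hands each window to pctrank as a NumPy array rather than a Series
result = s.rolling(3).apply(pctrank, raw=True)
print(result)
```

The first two entries are NaN because the window is not yet full. Note that this version breaks ties by sort order rather than averaging them.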

Your lambda receives a NumPy array, which does not have a .rank method; only pandas Series and DataFrame objects have it. You can thus change it to:

pctrank = lambda x: pd.Series(x).rank(pct=True).iloc[-1]
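To see what the pandas-based version returns, here is a small check on two hand-made windows (the arrays are illustrative, not from the question). It relies on rank's default behaviour of giving tied values the average of their ranks:

```python
import numpy as np
import pandas as pd

pctrank = lambda x: pd.Series(x).rank(pct=True).iloc[-1]

# Last value is the 2nd smallest of 3 distinct values
no_ties = pctrank(np.array([1.0, 4.0, 1.5]))

# Last value is tied with the first; the two share ranks 2 and 3, so each gets 2.5
tied = pctrank(np.array([2.0, 1.0, 2.0]))

print(no_ties, tied)
```

If you need ties handled differently, rank accepts a method argument (for example "min" or "max").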

Or you could use pure numpy along the lines of this SO answer:

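import numpy as np

def pctrank(x):
    n = len(x)
    temp = x.argsort()  # indices that would sort the window
    ranks = np.empty(n)
    ranks[temp] = (np.arange(n) + 1) / n  # fractional rank of every element
    return ranks[-1]  # keep only the rank of the last observation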
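A quick sanity check, sketched under the assumption that the window has no repeated values: the pure NumPy version should then agree with the pandas rank result. The name pctrank_np and the random test window are only for this illustration:

```python
import numpy as np
import pandas as pd

def pctrank_np(x):
    n = len(x)
    order = x.argsort()
    ranks = np.empty(n)
    ranks[order] = (np.arange(n) + 1) / n
    return ranks[-1]

rng = np.random.default_rng(0)
# A shuffled 0..9 has distinct values, so tie handling does not matter here
window = rng.permutation(10).astype(float)

expected = pd.Series(window).rank(pct=True).iloc[-1]
print(pctrank_np(window), expected)
```

With ties the two can differ, since the NumPy version assigns ranks in sort order while pandas averages them by default.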

The easiest option would be to do something like this:

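from scipy import stats

# 200 is the window size
# raw=True passes each window as a NumPy array, so x[-1] is its last value
# percentileofscore returns a value on a 0-100 scale
dataset[name] = dataset[name].rolling(200).apply(lambda x: stats.percentileofscore(x, x[-1]), raw=True)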
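One thing to keep in mind is that percentileofscore reports percentages rather than fractions. A minimal sketch on a made-up four-element window shows how to bring it onto the same 0 to 1 scale as the other answers:

```python
import numpy as np
from scipy import stats

# Illustrative window; the last value (3.0) is the third smallest of four
window = np.array([1.0, 2.0, 4.0, 3.0])

pct = stats.percentileofscore(window, window[-1])  # 0-100 scale, default kind="rank"
frac = pct / 100.0                                 # comparable to rank(pct=True)

print(pct, frac)
```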
