Calculate Arbitrary Percentile on Pandas GroupBy

You want the quantile method:

In [47]: df
Out[47]: 
           A         B    C
0   0.719391  0.091693  one
1   0.951499  0.837160  one
2   0.975212  0.224855  one
3   0.807620  0.031284  one
4   0.633190  0.342889  one
5   0.075102  0.899291  one
6   0.502843  0.773424  one
7   0.032285  0.242476  one
8   0.794938  0.607745  one
9   0.620387  0.574222  one
10  0.446639  0.549749  two
11  0.664324  0.134041  two
12  0.622217  0.505057  two
13  0.670338  0.990870  two
14  0.281431  0.016245  two
15  0.675756  0.185967  two
16  0.145147  0.045686  two
17  0.404413  0.191482  two
18  0.949130  0.943509  two
19  0.164642  0.157013  two

In [48]: df.groupby('C').quantile(.95)
Out[48]: 
            A         B
C                      
one  0.964541  0.871332
two  0.826112  0.969558
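If you want to check the numbers by hand, here is a minimal sketch on made-up data (the values below are hypothetical, not the frame shown above). It assumes a pandas version in which groupby quantile accepts a list of quantiles, which I believe is true for any reasonably recent release.

```python
import pandas as pd

# Hypothetical toy data: two groups with values that are easy to check by hand.
df = pd.DataFrame({
    "A": [0.0, 10.0, 20.0, 30.0, 40.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "C": ["one"] * 5 + ["two"] * 5,
})

# A single quantile gives one value per group
# (linear interpolation by default: about 38.0 for "one", about 4.8 for "two").
p95 = df.groupby("C")["A"].quantile(0.95)

# A list of quantiles adds a second index level holding the quantile itself.
multi = df.groupby("C")["A"].quantile([0.5, 0.95])

print(p95)
print(multi)
```

Note that quantile takes a fraction between 0 and 1, so the 95th percentile is 0.95, not 95.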

With pandas >= 0.25.0 you can also use named aggregation, which lets you name each output column directly.

An example would be:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.random.randint(1, 3, size=100), 'C': np.random.randn(100)})
df.groupby('A').agg(min_val=('C', 'min'), percentile_80=('C', lambda x: x.quantile(0.8)))
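If you need several percentiles, the keyword arguments for named aggregation can be built in a loop. This is only a sketch: the helper name q and the pNN column names are my own, not part of pandas, and the data is made up so the results can be checked by hand.

```python
import pandas as pd

# Hypothetical toy data with two groups.
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "C": [0.0, 10.0, 20.0, 30.0, 40.0, 1.0, 2.0, 3.0, 4.0, 5.0],
})

def q(pct):
    """Return an aggregator for the pct-th percentile (pct on a 0-100 scale)."""
    def agg_func(s):
        return s.quantile(pct / 100)
    # Give each aggregator a distinct name rather than relying on lambda handling.
    agg_func.__name__ = f"p{pct}"
    return agg_func

# One (column, function) pair per output column, keyed by the output name.
levels = [50, 80, 95]
named = {f"p{pct}": ("C", q(pct)) for pct in levels}

result = df.groupby("A").agg(min_val=("C", "min"), **named)
print(result)
```

The result has one row per value of A and the columns min_val, p50, p80 and p95.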

If you have to go through groupby and agg, another useful approach is a small factory that returns a named percentile function:

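import numpy as np

def percentile(n):
    # Build an aggregator for the n-th percentile (n is on a 0-100 scale).
    def percentile_(x):
        return np.percentile(x, n)
    # Name the function so the output column is labelled, e.g. 'percentile_95'.
    percentile_.__name__ = 'percentile_%s' % n
    return percentile_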

With the call below, I get the same result as the solution given by @TomAugspurger:

df.groupby('C').agg([percentile(50), percentile(95)])
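To convince yourself that the two routes agree, here is a small self-contained check on made-up data (hypothetical values, not the frame from the first answer). It repeats the factory so it runs on its own, and it assumes the default linear interpolation in both np.percentile and quantile.

```python
import numpy as np
import pandas as pd

def percentile(n):
    # Same factory as above; n is on a 0-100 scale.
    def percentile_(x):
        return np.percentile(x, n)
    percentile_.__name__ = 'percentile_%s' % n
    return percentile_

# Hypothetical toy data with two numeric columns and a grouping column.
df = pd.DataFrame({
    'A': [0.0, 10.0, 20.0, 30.0, 40.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    'B': [5.0, 4.0, 3.0, 2.0, 1.0, 40.0, 30.0, 20.0, 10.0, 0.0],
    'C': ['one'] * 5 + ['two'] * 5,
})

# Passing a list of functions to agg yields two-level columns:
# (original column, function name).
wide = df.groupby('C').agg([percentile(50), percentile(95)])

# The built-in method, with the quantile given as a fraction.
via_quantile = df.groupby('C').quantile(0.95)

# The 95th-percentile columns should match for every numeric column.
for col in ['A', 'B']:
    assert np.allclose(wide[(col, 'percentile_95')], via_quantile[col])
```

The only thing to watch is the scale: np.percentile expects 95, while quantile expects 0.95.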
