pandas group by aggregate code example

Example 1: pandas groupby aggregate quantile

# 50th Percentile
def q50(x):
    return x.quantile(0.5)

# 90th Percentile
def q90(x):
    return x.quantile(0.9)

my_DataFrame.groupby(['AGGREGATE']).agg({'MY_COLUMN': [q50, q90, 'max']})
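
The snippet above assumes `my_DataFrame` already exists. Here is a minimal runnable sketch of the same pattern. The sample values are made up for illustration, and the output names `median`, `p90`, and `maximum` are arbitrary choices. Passing a list of functions gives two-level column labels. The second call uses named aggregation, which gives flat column names.

```python
import pandas as pd

# Made-up sample data, for illustration only.
my_DataFrame = pd.DataFrame({
    "AGGREGATE": ["a", "a", "a", "b", "b"],
    "MY_COLUMN": [1.0, 2.0, 3.0, 10.0, 20.0],
})

# 50th percentile (median)
def q50(x):
    return x.quantile(0.5)

# 90th percentile
def q90(x):
    return x.quantile(0.9)

# A list of aggregators yields MultiIndex columns:
# ('MY_COLUMN', 'q50'), ('MY_COLUMN', 'q90'), ('MY_COLUMN', 'max')
result = my_DataFrame.groupby("AGGREGATE").agg({"MY_COLUMN": [q50, q90, "max"]})

# Named aggregation: keyword = (column, aggregator) gives flat column names.
flat = my_DataFrame.groupby("AGGREGATE").agg(
    median=("MY_COLUMN", q50),
    p90=("MY_COLUMN", q90),
    maximum=("MY_COLUMN", "max"),
)

print(result)
print(flat)
```

Custom aggregators are labelled by their function name, so `q50` and `q90` become column labels in `result`.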

Example 2: pandas groupby on one or two columns

# Group by a single column
grouped = df.groupby('A')

# Group by two columns (one group per unique (A, B) pair)
grouped = df.groupby(['A', 'B'])
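
A `groupby` object does nothing until an aggregation is applied to it. The sketch below shows what the two groupings produce once summed. The data and the value column `C` are assumptions added for illustration; only `A` and `B` come from the snippet above.

```python
import pandas as pd

# Hypothetical data; 'C' is an assumed value column.
df = pd.DataFrame({
    "A": ["x", "x", "y", "y"],
    "B": ["one", "two", "one", "one"],
    "C": [1, 2, 3, 4],
})

# One key: the result is indexed by the unique values of A.
by_a = df.groupby("A")["C"].sum()

# Two keys: the result has a two-level MultiIndex of (A, B) pairs.
by_ab = df.groupby(["A", "B"])["C"].sum()

# reset_index() turns the group keys back into ordinary columns.
by_ab_flat = by_ab.reset_index()

print(by_a)
print(by_ab)
print(by_ab_flat)
```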

Example 3: Aggregate on the entire DataFrame without group (PySpark)

# Note: this example uses the PySpark DataFrame API, not pandas.
from pyspark.sql import functions as F

df.agg({"age": "max"}).collect()
# [Row(max(age)=5)]
df.agg(F.min(df.age)).collect()
# [Row(min(age)=2)]
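
The snippet above uses `pyspark.sql`, so `.collect()` and `Row` are not available on a pandas DataFrame. A rough pandas counterpart might look like the sketch below. The `name` column and the sample rows are assumptions; the ages are chosen so the minimum and maximum match the outputs shown above.

```python
import pandas as pd

# Hypothetical data; ages chosen to match the min/max shown above.
df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [2, 5]})

# Dict form, as in the PySpark call: returns a Series indexed by column name.
max_age = df.agg({"age": "max"})

# Calling the reduction directly on the column returns a scalar.
min_age = df["age"].min()

# Several aggregations at once: returns a DataFrame with one row per aggregation.
summary = df.agg({"age": ["min", "max"]})

print(max_age)
print(min_age)
print(summary)
```

pandas evaluates these calls eagerly, so no `.collect()` step is needed.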