pandas groupby multiple aggregation on different columns code example
Example 1: dataframe groupby multiple columns
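# df is assumed to be an existing pandas DataFrame with 'Team', 'Pos' and 'Age' columns
grouped_multiple = df.groupby(['Team', 'Pos']).agg({'Age': ['mean', 'min', 'max']})
# The dict-of-lists form yields MultiIndex columns; flatten them to plain names
grouped_multiple.columns = ['age_mean', 'age_min', 'age_max']
# Turn the 'Team' and 'Pos' group keys back into regular columns
grouped_multiple = grouped_multiple.reset_index()
print(grouped_multiple)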
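Example 1 applies several functions to a single column. To aggregate different columns in one call, pandas named aggregation (available since pandas 0.25) is one option. The sketch below is only illustrative: the sample data and the 'Salary' column are assumptions, not part of the original example.

```python
import pandas as pd

# Hypothetical sample data; 'Salary' is an assumed extra column for illustration.
df = pd.DataFrame({
    "Team": ["A", "A", "B", "B"],
    "Pos": ["G", "G", "F", "C"],
    "Age": [25, 31, 28, 22],
    "Salary": [100, 200, 150, 120],
})

# Each keyword becomes an output column: name=(source_column, function).
summary = (
    df.groupby(["Team", "Pos"])
      .agg(
          age_mean=("Age", "mean"),
          age_max=("Age", "max"),
          salary_total=("Salary", "sum"),
      )
      .reset_index()
)
print(summary)
```

Named aggregation produces flat column names directly, so the manual rename step from Example 1 is not needed.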
Example 2: Aggregate on the entire DataFrame without groups (PySpark)
from pyspark.sql import functions as F
# df here is a PySpark DataFrame, not a pandas one
df.agg({"age": "max"}).collect()
df.agg(F.min(df.age)).collect()
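The calls in Example 2 use `.collect()` and `pyspark.sql.functions`, which belong to PySpark rather than pandas. A rough pandas counterpart, sketched with made-up data (the 'height' column is an assumption), might look like this:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"age": [2, 5, 3], "height": [80, 95, 88]})

# A dict maps each column to its own aggregation; no groupby is involved.
result = df.agg({"age": "max", "height": "min"})
print(result)
```

With one function per column, pandas returns a Series indexed by column name, and there is no `.collect()` step.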
Example 3: group by 2 columns pandas
In [11]: df.groupby(['col5', 'col2']).size()
Out[11]:
col5  col2
1     A       1
      D       3
2     B       2
3     A       3
      C       1
4     B       1
5     B       2
6     B       1
dtype: int64
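`size()` returns a Series indexed by the group keys. If a flat table is easier to work with, one possible approach is `reset_index(name=...)`. The data below is hypothetical, chosen so that the counts match the first three rows of the output above; the column name 'count' is also an assumption.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "col5": [1, 1, 1, 1, 2, 2],
    "col2": ["A", "D", "D", "D", "B", "B"],
})

# Count rows per (col5, col2) pair, then move the keys back into columns.
counts = df.groupby(["col5", "col2"]).size().reset_index(name="count")
print(counts)
```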