PySpark groupBy multiple columns: code examples
Example 1: dataframe groupby multiple columns
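# pandas (not PySpark): group by two columns and compute several statistics of Age
grouped_multiple = df.groupby(['Team', 'Pos']).agg({'Age': ['mean', 'min', 'max']})
# Flatten the resulting MultiIndex columns into plain names
grouped_multiple.columns = ['age_mean', 'age_min', 'age_max']
# Turn the group keys (Team, Pos) back into ordinary columns
grouped_multiple = grouped_multiple.reset_index()
print(grouped_multiple)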
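A self-contained variant of Example 1 may make the result easier to check. The sketch below assumes a small made-up DataFrame (the rows are hypothetical; only the column names Team, Pos and Age come from the example) and uses pandas named aggregation, which produces flat column names directly instead of renaming them afterwards.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from Example 1.
df = pd.DataFrame({
    "Team": ["A", "A", "A", "B", "B"],
    "Pos": ["G", "G", "F", "G", "G"],
    "Age": [20, 30, 25, 28, 32],
})

# Named aggregation: each keyword is an output column,
# each value is a (source column, aggregation function) pair.
grouped_multiple = df.groupby(["Team", "Pos"], as_index=False).agg(
    age_mean=("Age", "mean"),
    age_min=("Age", "min"),
    age_max=("Age", "max"),
)
print(grouped_multiple)
```

With as_index=False the group keys stay ordinary columns, so no reset_index() call is needed.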
Example 2: PySpark groupBy with average and count
# PySpark: group by one column; the dict maps each column to an aggregate function name
df.groupBy("Profession").agg({'Age':'avg', 'Gender':'count'}).show()