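PySpark: group by and aggregate with multiple columns (code example)

Example: PySpark aggregation on multiple columns

from pyspark.sql.functions import avg, count

df.groupBy("year", "sex").agg(avg("percent"), count("*"))

This groups the rows of df by each distinct (year, sex) pair and returns, per group, the mean of percent and the number of rows.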