select and aggregate in pyspark on dataframe code example
Example 1: pyspark groupby sum
from pyspark.sql import functions as func
# group the order items by order id and sum each order's subtotals
prova_df.groupBy("order_item_order_id").agg(func.sum("order_item_subtotal")).show()
Example 2: Pyspark Aggregation on multiple columns
from pyspark.sql.functions import avg, count
# group by two columns and compute several aggregates at once
df.groupBy("year", "sex").agg(avg("percent"), count("*")).show()