create groups in column data in dataframe pandas code example
Example 1: group by pandas examples
>>> n_by_state = df.groupby("state")["state"].count()
>>> n_by_state.head(10)
state
AK 16
AL 206
AR 117
AS 2
AZ 48
CA 361
CO 90
CT 240
DC 2
DE 97
Name: last_name, dtype: int64
Example 2: Groups the DataFrame using the specified columns
df.groupBy().avg().collect()
sorted(df.groupBy('name').agg({'age': 'mean'}).collect())
sorted(df.groupBy(df.name).avg().collect())
sorted(df.groupBy(['name', df.age]).count().collect())