groupby transform pandas code example
Example 1: Impute missing values using groupby and transform (two variants, timed)
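import pandas as pd
from datetime import datetime

def generate_data():
    ...

# Variant 1: select the 'value' column first, then transform only that column.
t = datetime.now()
df = generate_data()
df['value'] = df.groupby(['category', 'name'])['value']\
    .transform(lambda x: x.fillna(x.mean()))
print(datetime.now() - t)

# Variant 2: transform every non-key column, then keep only 'value'.
# The lambda is applied to each column, which is usually slower.
t = datetime.now()
df = generate_data()
df['value'] = df.groupby(['category', 'name'])\
    .transform(lambda x: x.fillna(x.mean()))['value']
print(datetime.now() - t)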
Example 2: Group a PySpark DataFrame by the specified columns (DataFrame.groupBy)
df.groupBy().avg().collect()
sorted(df.groupBy('name').agg({'age': 'mean'}).collect())
sorted(df.groupBy(df.name).avg().collect())
sorted(df.groupBy(['name', df.age]).count().collect())
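The calls above use PySpark's DataFrame.groupBy rather than pandas. For comparison, here is a rough pandas sketch of similar aggregations. The frame and its values are made up for illustration, and the correspondence to the PySpark calls is approximate.

```python
import pandas as pd

# Illustrative data (made up for this sketch).
people = pd.DataFrame({"name": ["Alice", "Bob", "Bob"], "age": [2, 5, 7]})

# Roughly groupBy().avg(): one mean over the whole frame.
overall_mean = people["age"].mean()

# Roughly groupBy('name').agg({'age': 'mean'}): mean age per name.
mean_by_name = people.groupby("name")["age"].mean()

# Roughly groupBy(['name', df.age]).count(): rows per (name, age) pair.
count_by_name_age = people.groupby(["name", "age"]).size()

print(mean_by_name.to_dict())  # {'Alice': 2.0, 'Bob': 6.0}
```

Unlike transform, these aggregations return one row per group instead of one row per input row.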