Return aggregate for all unique in a group
continuing from where u stopped, a combo of stack and unstack will give you ur required output:
res = (df.groupby(['month','cat'])
.sales
.sum()
#unstack and fill value for the null column
.unstack(fill_value=0)
#return to groupby form and reset
.stack()
.reset_index(name='sales')
)
res
month cat sales
0 dec a 11
1 dec b 2
2 feb a 12
3 feb b 4
4 jan a 0
5 jan b 5
6 mar a 0
7 mar b 4
Use MultiIndex
with reindex
as:
df=(
df.groupby(['month','cat']).sales.sum()
.reindex(pd.MultiIndex.from_product([df.month.unique(), df.cat.unique()],
names=['month', 'cat']), fill_value=0)
.reset_index()
)
print(df)
month cat sales
0 dec a 11
1 dec b 2
2 feb a 12
3 feb b 4
4 jan a 0
5 jan b 5
6 mar a 0
7 mar b 4
Another way without groupby
but with pivot_table
and stack
:
df_ = df.pivot_table(index='month',columns='cat',
values='sales', aggfunc=sum, fill_value=0)\
.stack().reset_index()
print (df_)
month cat 0
0 dec a 11
1 dec b 2
2 feb a 12
3 feb b 4
4 jan a 0
5 jan b 5
6 mar a 0
7 mar b 4