Return aggregate for all unique in a group

continuing from where u stopped, a combo of stack and unstack will give you ur required output:

res = (df.groupby(['month','cat'])
       .sales
       .sum()
       #unstack and fill value for the null column
       .unstack(fill_value=0)
       #return to groupby form and reset
       .stack()
       .reset_index(name='sales')
      )

res

  month cat sales
0   dec a   11
1   dec b   2
2   feb a   12
3   feb b   4
4   jan a   0
5   jan b   5
6   mar a   0
7   mar b   4

Use MultiIndex with reindex as:

df=(
    df.groupby(['month','cat']).sales.sum()
    .reindex(pd.MultiIndex.from_product([df.month.unique(), df.cat.unique()], 
                                   names=['month', 'cat']), fill_value=0)
    .reset_index()
)

print(df)
  month cat  sales
0   dec   a     11
1   dec   b      2
2   feb   a     12
3   feb   b      4
4   jan   a      0
5   jan   b      5
6   mar   a      0
7   mar   b      4

Another way without groupby but with pivot_table and stack:

df_ = df.pivot_table(index='month',columns='cat', 
                     values='sales', aggfunc=sum, fill_value=0)\
        .stack().reset_index()
print (df_)
  month cat   0
0   dec   a  11
1   dec   b   2
2   feb   a  12
3   feb   b   4
4   jan   a   0
5   jan   b   5
6   mar   a   0
7   mar   b   4

Return aggregate for all unique in a group

Tags:

Python

Pandas

Aggregation

Pandas Groupby

Related

Recent Posts