Counting column values based on values in other columns for Pandas dataframes
pivot_table
sounds like what you want. A bit of a hack is to add a column of 1
's to use to count. This allows pivot_table
to add 1
for each occurrence of a particular x
-y
and Category
combination. You will set this new column as your value
parameter in pivot_table
and the aggfunc
paraemter to np.sum
. You'll probably want to set fill_value
to 0
as well:
df['count'] = 1
result = df.pivot_table(
index=['x', 'y'], columns='Category', values='count',
fill_value=0, aggfunc=np.sum
)
result
:
Category 1 2 3 4
x y
1 1 0 0 1 0
2 1 1 1 0 0
3 2 0 0 0 1
If you're interested in keeping x
and y
as columns and having the other column names as Category X
, you can rename the columns and use reset_index
:
result.columns = [f'Category {x}' for x in result.columns]
result = a.reset_index()