How to assign unique values to groups of rows in a pandas dataframe based on a condition?

You can use cumsum and map to letters with chr:

m = df['A'].eq(0)
df['B'] = m.cumsum().add(65).map(chr).mask(m, '-')
df

   A  B
0  3  A
1  5  A
2  0  -
3  2  B
4  6  B
5  9  B
6  0  -
7  3  C
8  4  C
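
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same steps. The sample values for column A are read off the output in this answer, and the intermediate names (is_zero, group_id, letters) are illustrative only:

```python
import pandas as pd

# Sample data reconstructed from the output shown in this answer.
df = pd.DataFrame({'A': [3, 5, 0, 2, 6, 9, 0, 3, 4]})

is_zero = df['A'].eq(0)               # True on the rows where A is 0
group_id = is_zero.cumsum()           # 0, 0, 1, 1, 1, 1, 2, 2, 2
letters = group_id.add(65).map(chr)   # 65 is ord('A'), so 0 -> 'A', 1 -> 'B', ...
df['B'] = letters.mask(is_zero, '-')  # replace the label on the zero rows with '-'

print(df)
```

Keep in mind that chr(65 + n) runs past 'Z' once there are more than 26 groups (chr(91) is '['), so letter labels only make sense for a small number of groups.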

A NumPy solution can be written from this using views, and should be quite fast:

import numpy as np

m = np.cumsum(df['A'].values == 0)
# thanks to @user3483203 for the neat trick!
# view('U2') reinterprets each 8-byte integer as a short unicode string
df['B'] = (m + 65).view('U2')
df

   A  B
0  3  A
1  5  A
2  0  B
3  2  B
4  6  B
5  9  B
6  0  C
7  3  C
8  4  C
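
The view trick works because each 8-byte integer is reinterpreted as two 4-byte Unicode code points, the second of which is a null that NumPy drops. That relies on the cumulative sum being int64 and on a little-endian machine. Below is a sketch of a variant that, as far as I can tell, avoids both assumptions by casting to int32 and viewing as 'U1'. The array a and the names group_id, codes and labels are illustrative:

```python
import numpy as np

a = np.array([3, 5, 0, 2, 6, 9, 0, 3, 4])  # same values as column A above

group_id = np.cumsum(a == 0)               # 0, 0, 1, 1, 1, 1, 2, 2, 2
codes = (group_id + 65).astype(np.int32)   # one 4-byte code point per element
labels = codes.view('U1')                  # reinterpret the bytes as 1-char strings

print(labels.tolist())
```

Like the original, this does not replace the zero rows with '-'; it only produces the group letters.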

From v0.22, you can also do this through pandas Series.view:

m = df['A'].eq(0)
df['B'] = (m.cumsum()+65).view('U2').mask(m, '-')
df

   A  B
0  3  A
1  5  A
2  0  -
3  2  B
4  6  B
5  9  B
6  0  -
7  3  C
8  4  C
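
As far as I know, Series.view is deprecated in recent pandas releases (I believe since 2.2), so the snippet above may warn or fail there. A hedged sketch of the same idea that does the view on the underlying NumPy array and wraps the result back into a Series (codes and letters are illustrative names, and the int32/'U1' cast is my substitution for the original 'U2' view):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [3, 5, 0, 2, 6, 9, 0, 3, 4]})

m = df['A'].eq(0)
# Do the byte reinterpretation in NumPy instead of through Series.view.
codes = (m.cumsum().to_numpy() + 65).astype(np.int32)
letters = pd.Series(codes.view('U1'), index=df.index)
df['B'] = letters.mask(m, '-')   # mark the zero rows with '-'

print(df)
```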

Here's one way using np.where. I'm using numerical labels here, which might be more appropriate when there are many groups:

import numpy as np

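# Select the column so that m is a 1-D boolean Series, not a DataFrame
m = df['A'].eq(0)
# '-' on the zero rows, the running group number everywhere else
df['A'] = np.where(m, '-', m.cumsum())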

   A
0  0
1  0
2  -
3  1
4  1
5  1
6  -
7  2
8  2
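
One side effect of np.where here is that mixing '-' with integers turns the whole column into strings. If the labels need to stay numeric, a possible alternative (my assumption, not part of the answer above) is to leave the zero rows as missing values in a nullable integer column. The column name group is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({'A': [3, 5, 0, 2, 6, 9, 0, 3, 4]})

m = df['A'].eq(0)
# mask(m) blanks out the zero rows; 'Int64' keeps the rest as integers
# and represents the blanks as <NA> instead of the string '-'.
df['group'] = m.cumsum().mask(m).astype('Int64')

print(df)
```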

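If I understand correctly, you can build a mapping from group number to letter with string.ascii_uppercase: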

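import string

# Group counter: increases by one at every zero in A
s = df.A.eq(0).cumsum()
# Map each group number to an uppercase letter (covers at most 26 groups)
d = dict(zip(s.unique(), string.ascii_uppercase[:s.max() + 1]))
# Label the non-zero rows, then fill the dropped zero rows with '-'
s.loc[df.A != 0].map(d).reindex(df.index, fill_value='-')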
Out[360]: 
0    A
1    A
2    -
3    B
4    B
5    B
6    -
7    C
8    C
Name: A, dtype: object
