Creating a new column that assigns the same index to repeated values in a pandas DataFrame
Check with factorize:
df['#']=df.id.factorize()[0]+1
df
    id  color  #
0  123  white  1
1  123  white  1
2  123  white  1
3  345   blue  2
4  345   blue  2
5  678    red  3
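For a self-contained version, here is a minimal sketch that rebuilds the sample frame from the output shown above. How the frame is constructed is an assumption, since only its printed form appears here. factorize returns a pair (codes, uniques); the codes are 0-based and numbered in order of first appearance, hence the +1.

```python
import pandas as pd

# Sample frame rebuilt from the printed output (construction is assumed)
df = pd.DataFrame({
    "id": [123, 123, 123, 345, 345, 678],
    "color": ["white", "white", "white", "blue", "blue", "red"],
})

# codes: one integer per row; uniques: the distinct ids in order of appearance
codes, uniques = pd.factorize(df["id"])

# Shift the 0-based codes so the labels start at 1
df["#"] = codes + 1
```

Keeping uniques around is optional, but it lets you map a label back to its id later (label n corresponds to uniques[n - 1]).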
Another method, with groupby and ngroup:
df.groupby('id').ngroup()+1
0 1
1 1
2 1
3 2
4 2
5 3
dtype: int64
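The two methods agree here only because the ids already appear in ascending order. factorize numbers values in order of first appearance, while groupby sorts the keys by default, so ngroup numbers them in sorted order. A small sketch on a hypothetical frame (df2 is made up for illustration) shows the difference, and that passing sort=False to groupby should bring ngroup in line with factorize:

```python
import pandas as pd

# Hypothetical frame whose ids are NOT in ascending order
df2 = pd.DataFrame({"id": [678, 123, 678, 345]})

# Numbered by first appearance: 678 -> 1, 123 -> 2, 345 -> 3
by_appearance = (df2["id"].factorize()[0] + 1).tolist()

# Numbered by sorted key: 123 -> 1, 345 -> 2, 678 -> 3
by_sorted_key = (df2.groupby("id").ngroup() + 1).tolist()

# sort=False keeps groups in order of first appearance
by_appearance_ngroup = (df2.groupby("id", sort=False).ngroup() + 1).tolist()

print(by_appearance)         # [1, 2, 1, 3]
print(by_sorted_key)         # [3, 1, 3, 2]
print(by_appearance_ngroup)  # [1, 2, 1, 3]
```

Pick whichever ordering you actually need. If the labels should follow the sorted ids, factorize(sort=True) is another option.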
To add it in the first position:
df.insert(loc=0, column='#', value=df.id.factorize()[0]+1)
df
   #   id  color
0  1  123  white
1  1  123  white
2  1  123  white
3  2  345   blue
4  2  345   blue
5  3  678    red
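One caveat: insert raises a ValueError if a column named '#' already exists, which happens if you ran the df['#'] = ... assignment first. A minimal sketch of one way around that, assuming a shortened version of the sample frame, is to pop the existing column and re-insert it at position 0:

```python
import pandas as pd

# Shortened sample frame (assumed for illustration)
df = pd.DataFrame({
    "id": [123, 123, 345],
    "color": ["white", "white", "blue"],
})

# Column already appended at the end by plain assignment
df["#"] = df["id"].factorize()[0] + 1

# pop() removes '#' and returns it as a Series before insert() runs,
# so there is no duplicate-name error; loc=0 places it first
df.insert(0, "#", df.pop("#"))
```

If the column does not exist yet, the single insert call shown above is all you need.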
You can also use categorical codes (these start at 0, so add 1 if you want the same numbering as above):
df['id'].astype('category').cat.codes
Output:
0 0
1 0
2 0
3 1
4 1
5 2
dtype: int8
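Two details in that output are easy to miss: the codes are 0-based, and the dtype is int8 because pandas picks the smallest integer type that fits the number of categories. The sketch below, again using a frame rebuilt from the example (an assumption), shifts the codes to start at 1 and widens the dtype first so the result matches the int64 labels from the other methods:

```python
import pandas as pd

# Sample frame rebuilt from the printed output (construction is assumed)
df = pd.DataFrame({
    "id": [123, 123, 123, 345, 345, 678],
    "color": ["white", "white", "white", "blue", "blue", "red"],
})

# 0-based codes; categories are the sorted unique ids by default
codes = df["id"].astype("category").cat.codes

# Widen from int8 to int64, then shift to 1-based labels
df["#"] = codes.astype("int64") + 1
```

Like ngroup, the categorical codes follow the sorted order of the ids rather than their order of appearance.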