pandas: drop values in a group if the group has multiple values
You can try:
# find the number of unique quantity for each thing
s = df.groupby('id')['quantity'].transform('nunique')
df[s.eq(1) # things with only 1 quantity value (either 0 or 1)
| df['quantity'].eq(1) # or quantity==1 when there are 2 values
]
Output:
        id        date  quantity
2  thing 1  2016-09-01         1
3  thing 1  2016-10-01         1
4  thing 2  2017-01-01         1
5  thing 2  2017-02-01         1
6  thing 2  2017-02-11         1
7  thing 3  2017-09-01         0
8  thing 3  2017-10-01         0
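To see how the mask works end to end, here is a minimal runnable sketch. The input frame is an assumption: the original question's data is not shown here, so rows 0 and 1 (and their dates) are hypothetical, chosen only so that the filtered result has the same index as the output above.

```python
import pandas as pd

# Hypothetical input: rows 0 and 1 and their dates are assumed, not taken
# from the original question.
df = pd.DataFrame({
    "id": ["thing 1"] * 4 + ["thing 2"] * 3 + ["thing 3"] * 2,
    "date": pd.to_datetime([
        "2016-07-01", "2016-08-01", "2016-09-01", "2016-10-01",
        "2017-01-01", "2017-02-01", "2017-02-11",
        "2017-09-01", "2017-10-01",
    ]),
    "quantity": [0, 0, 1, 1, 1, 1, 1, 0, 0],
})

# Number of distinct quantity values in each row's group, aligned to df's index
s = df.groupby("id")["quantity"].transform("nunique")

# Keep a row if its group holds a single value, or if the row itself is 1
result = df[s.eq(1) | df["quantity"].eq(1)]
print(result)
```

Because transform returns a Series with the same index as df, it can be combined directly with the row-level condition using |.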
Based on your logic, try transform with max: a row is kept when its quantity equals the maximum quantity for its id.
# Logic: if a group contains only 0s or only 1s, its max is that value, so every row is kept.
# If a group contains both 0 and 1, its max is 1, so only the rows equal to 1 are kept.
out = df[df.quantity.eq(df.groupby('id')['quantity'].transform('max'))]
Out[89]:
        id        date  quantity
2  thing 1  2016-09-01         1
3  thing 1  2016-10-01         1
4  thing 2  2017-01-01         1
5  thing 2  2017-02-01         1
6  thing 2  2017-02-11         1
7  thing 3  2017-09-01         0
8  thing 3  2017-10-01         0
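As a quick sanity check, the sketch below compares the max-based filter with the nunique-based one on a small made-up frame. The ids "a", "b", "c" and their values are hypothetical; the comparison assumes quantity only ever takes the values 0 and 1, which is the case the two filters are designed for.

```python
import pandas as pd

# Hypothetical data: one mixed group ("a"), one all-zero group ("b"),
# and one all-one group ("c").
df = pd.DataFrame({
    "id": ["a", "a", "a", "b", "b", "c"],
    "quantity": [0, 1, 1, 0, 0, 1],
})

# Filter 1: keep rows whose quantity equals the maximum of their group
group_max = df.groupby("id")["quantity"].transform("max")
by_max = df[df["quantity"].eq(group_max)]

# Filter 2: keep rows from single-valued groups, or rows equal to 1
s = df.groupby("id")["quantity"].transform("nunique")
by_nunique = df[s.eq(1) | df["quantity"].eq(1)]

print(by_max)
print(by_max.equals(by_nunique))
```

If quantity could take other values (say 1 and 2 in the same group), the two filters would no longer agree: the max version keeps the largest value in each group, while the nunique version keeps the rows equal to 1.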