Pandas: How do I assign values based on multiple conditions for existing columns?
You can do this using np.where
, the conditions use bitwise &
and |
for and
and or
with parentheses around the multiple conditions due to operator precedence. So where the condition is true 5
is returned and 0
otherwise:
In [29]:
df['points'] = np.where( ( (df['gender'] == 'male') & (df['pet1'] == df['pet2'] ) ) | ( (df['gender'] == 'female') & (df['pet1'].isin(['cat','dog'] ) ) ), 5, 0)
df
Out[29]:
gender pet1 pet2 points
0 male dog dog 5
1 male cat cat 5
2 male dog cat 0
3 female cat squirrel 5
4 female dog dog 5
5 female squirrel cat 0
6 squirrel dog cat 0
using apply.
def f(x):
if x['gender'] == 'male' and x['pet1'] == x['pet2']: return 5
elif x['gender'] == 'female' and (x['pet1'] == 'cat' or x['pet1'] == 'dog'): return 5
else: return 0
data['points'] = data.apply(f, axis=1)
numpy.select
This is a perfect case for np.select
where we can create a column based on multiple conditions and it's a readable method when there are more conditions:
conditions = [
df['gender'].eq('male') & df['pet1'].eq(df['pet2']),
df['gender'].eq('female') & df['pet1'].isin(['cat', 'dog'])
]
choices = [5,5]
df['points'] = np.select(conditions, choices, default=0)
print(df)
gender pet1 pet2 points
0 male dog dog 5
1 male cat cat 5
2 male dog cat 0
3 female cat squirrel 5
4 female dog dog 5
5 female squirrel cat 0
6 squirrel dog cat 0