Using np.where but maintaining existing values if condition is False
There is a pandas.Series method (where, incidentally) for exactly this kind of task. It seems a little backward at first, but from the documentation:
Series.where(cond, other=nan, inplace=False, axis=None, level=None, try_cast=False, raise_on_error=True)
Return an object of same shape as self and whose corresponding entries are from self where cond is True and otherwise are from other.
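To make the "keep where True, replace where False" direction concrete, here is a minimal sketch on a made-up Series (the values and the names s, kept and filled are illustrative, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data, only for illustration.
s = pd.Series([0, 1, 2, 3])

# Entries where the condition is True are kept; the rest become NaN (the default `other`).
kept = s.where(s > 1)

# Passing `other` chooses the replacement value explicitly.
filled = s.where(s > 1, -1)
```

Note that the condition describes what to keep, which is the opposite of np.where(cond, x, y), where the condition selects x.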
So, your example would become:
cols = ['a', 'b', 'c', 'd']
condition = (DF[cols] == 0).all(axis=1)
for col in cols:
    # Assign the result back; where(..., inplace=True) on DF[col] may act on a copy.
    DF[col] = DF[col].where(~condition, np.nan)
But if all you're trying to do is replace rows of all zeros for a specific set of columns with NA, you could do this instead:
DF.loc[condition, cols] = NA
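As a rough end-to-end sketch of the .loc approach, assuming NA stands for np.nan and using a made-up DataFrame (the data and the extra column e are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical data: rows 1 and 3 are all zeros in columns a-d.
DF = pd.DataFrame({
    "a": [1.0, 0.0, 3.0, 0.0],
    "b": [0.0, 0.0, 2.0, 0.0],
    "c": [5.0, 0.0, 0.0, 0.0],
    "d": [0.0, 0.0, 1.0, 0.0],
    "e": [9.0, 8.0, 7.0, 6.0],  # not in `cols`, so it should stay untouched
})

cols = ["a", "b", "c", "d"]
# True for rows where every selected column equals zero.
condition = (DF[cols] == 0).all(axis=1)

# Overwrite only the selected columns in the matching rows.
DF.loc[condition, cols] = np.nan
```

Rows that contain some zeros but not all zeros (such as row 0 here) are left alone, as is column e.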
EDIT
To answer your original question, np.where follows the same broadcasting rules as other array operations, so you would replace ??? with DF[col], changing your example to:
cols = ['a', 'b', 'c', 'd']
condition = (DF[cols] == 0).all(axis=1)
for col in cols:
    DF[col] = np.where(condition, NA, DF[col])
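The following sketch, on hypothetical data and again assuming NA means np.nan, suggests that the np.where form and the Series.where form give the same values; only the direction of the condition differs:

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 is all zeros.
DF = pd.DataFrame({"a": [1.0, 0.0, 3.0], "b": [0.0, 0.0, 2.0]})
cols = ["a", "b"]
condition = (DF[cols] == 0).all(axis=1)

# np.where: the condition selects NaN; the existing column is the fallback.
via_np = {col: np.where(condition, np.nan, DF[col]) for col in cols}

# Series.where: the (negated) condition selects what to keep.
via_pd = {col: DF[col].where(~condition, np.nan) for col in cols}
```

np.where returns a plain NumPy array, while Series.where returns a Series with the original index, which can matter if the index is not the default one.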
The proposed solutions work, but for a NumPy array there is a simpler way that does not need a DataFrame.
One solution would be:
np_array[np.where(condition)] = value_of_condition_true_rows
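A small sketch of this on a made-up 2-D array (np_array's contents and the names by_where and by_mask are illustrative). It also shows that, for a boolean condition, indexing with the mask directly appears to give the same result as going through np.where:

```python
import numpy as np

# Hypothetical data: row 1 is all zeros.
np_array = np.array([[1.0, 0.0], [0.0, 0.0], [3.0, 4.0]])
condition = (np_array == 0).all(axis=1)

# Single-argument np.where returns the indices where the condition is True.
by_where = np_array.copy()
by_where[np.where(condition)] = np.nan

# A boolean mask can be used as the index directly.
by_mask = np_array.copy()
by_mask[condition] = np.nan
```

In both cases only the selected rows are overwritten in place, so the existing values everywhere else are preserved.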