Replacing values greater than a number in pandas dataframe
Very simply:
df[df > 9] = 11
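For what it's worth, here is a minimal sketch of that one-liner on a small frame. The sample data and column names are made up for illustration, and it assumes each cell holds a single number rather than a list:

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({"A": [3, 12, 9], "B": [10, 5, 40]})

# df > 9 builds a boolean frame of the same shape; assigning through it
# overwrites only the cells where the mask is True.
df[df > 9] = 11

print(df)
```

The mask covers the whole frame, so every numeric column is affected, not just one.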
You can use apply with a list comprehension:
df1['A'] = df1['A'].apply(lambda x: [y if y <= 9 else 11 for y in x])
print (df1)
A
2017-01-01 02:00:00 [11, 11, 11]
2017-01-01 03:00:00 [3, 11, 9]
A faster solution is to first convert the column to a numpy array and then use numpy.where:
a = np.array(df1['A'].values.tolist())
print (a)
[[33 34 39]
[ 3 43 9]]
df1['A'] = np.where(a > 9, 11, a).tolist()
print (df1)
A
2017-01-01 02:00:00 [11, 11, 11]
2017-01-01 03:00:00 [3, 11, 9]
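If you want to try both approaches side by side, the sketch below rebuilds a frame shaped like the printed one (the values are taken from the output above; the construction itself and the names idx, via_apply and via_where are my assumptions). The numpy route assumes every list has the same length:

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one printed above.
idx = pd.to_datetime(["2017-01-01 02:00:00", "2017-01-01 03:00:00"])
df1 = pd.DataFrame({"A": [[33, 34, 39], [3, 43, 9]]}, index=idx)

# Approach 1: per-row list comprehension via apply.
via_apply = df1["A"].apply(lambda x: [y if y <= 9 else 11 for y in x])

# Approach 2: stack the lists into a 2-D array and replace with numpy.where.
# This only works when all lists have equal length.
a = np.array(df1["A"].tolist())
via_where = pd.Series(np.where(a > 9, 11, a).tolist(), index=df1.index)

print(via_apply.tolist())
print(via_where.tolist())
```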
You can use numpy indexing, accessed through the .values attribute.
df['col'].values[df['col'].values > x] = y
where you are replacing any value greater than x with the value of y.
So for the example in the question:
df1['A'].values[df1['A'] > 9] = 11
I know this is an old post, but pandas now supports DataFrame.where directly. In your example:
df.where(df <= 9, 11, inplace=True)
Please note that pandas' where is different from numpy.where. In pandas, where the condition is True, the current value in the dataframe is kept; where the condition is False, the other value is used.
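A small sketch of the difference, using a made-up Series (the values and the names kept and picked are only for illustration):

```python
import numpy as np
import pandas as pd

s = pd.Series([3, 12, 9, 40])  # hypothetical sample values

# pandas: keep values where the condition is True, replace the rest with 11.
kept = s.where(s <= 9, 11)

# numpy: take the second argument where the condition is True and the
# third where it is False, so the condition is flipped here.
picked = np.where(s > 9, 11, s)

print(kept.tolist())
print(picked.tolist())
```

Both lines should print the same list; only the direction of the condition changes.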
EDIT:
You can achieve the same for just a column with Series.where:
df['A'] = df['A'].where(df['A'] <= 9, 11)