Replace invalid values with None in Pandas DataFrame
I prefer the solution using replace with a dict because of its simplicity and elegance:
df.replace({'-': None})
You can also have more replacements:
df.replace({'-': None, 'None': None})
Even with many replacements, it stays obvious what is replaced by what, which is much harder to see with long lists, in my opinion.
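As a minimal sketch of the dict form (the column names and sample values here are made up for illustration), each key is a value to look for and each dict value is what it gets replaced with:

```python
import pandas as pd

# Hypothetical sample data: "-" and the string "None" mark invalid entries.
df = pd.DataFrame({"a": ["-", 3, 2], "b": [1, "None", "-"]})

# Each key is the value to find; each dict value is its replacement.
cleaned = df.replace({"-": None, "None": None})

# The replaced cells are now reported as missing; df itself is unchanged.
print(cleaned.isna())
```

Note that replace returns a new DataFrame here, so assign the result if you want to keep it.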
Actually, in later versions of pandas, passing None directly as the replacement value gives a TypeError:
df.replace('-', None)
TypeError: If "to_replace" and "value" are both None then regex must be a mapping
You can do it by passing either a list or a dictionary:
In [11]: df.replace(['-'], [None])  # or .replace('-', {0: None})
Out[11]:
      0
0  None
1     3
2     2
3     5
4     1
5    -5
6    -1
7  None
8     9
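A self-contained version of the list form, for anyone who wants to run it. The construction of df is an assumption, since the original does not show it; it is rebuilt here from the values in the output above:

```python
import pandas as pd

# Assumed reconstruction of the example frame: a single column named 0.
df = pd.DataFrame({0: ["-", 3, 2, 5, 1, -5, -1, "-", 9]})

# The two lists are paired up by position: "-" is replaced by None.
result = df.replace(["-"], [None])

print(result)
```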
But I recommend using NaNs rather than None:
In [12]: df.replace('-', np.nan)
Out[12]:
     0
0  NaN
1    3
2    2
3    5
4    1
5   -5
6   -1
7  NaN
8    9
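One practical reason to prefer NaN, sketched under the assumption that the column is meant to be numeric: NaN fits in a float column, so once the placeholder is gone the data can be converted and aggregated directly. The frame is again an assumed reconstruction of the example above.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the example frame: a single column named 0.
df = pd.DataFrame({0: ["-", 3, 2, 5, 1, -5, -1, "-", 9]})

# Replace the placeholder with NaN instead of None.
with_nan = df.replace("-", np.nan)

# Convert explicitly rather than relying on pandas' own type inference.
numeric = with_nan[0].astype(float)

print(numeric.dtype)   # float64
print(numeric.mean())  # NaN entries are skipped by default
```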