How to replace a value in pandas with NaN?

df = df.replace({'?': np.nan})

This uses a dictionary to map any value to NaN.


You can replace it in just that column using replace (it returns a new object, so assign the result back):

df['workclass'] = df['workclass'].replace('?', np.nan)

or for the whole df:

df = df.replace('?', np.nan)
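As a quick check that the replacement took effect, here is a minimal sketch. The sample frame is made up for illustration; only the 'workclass' column name comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the 'workclass' column name comes from the question.
df = pd.DataFrame({"workclass": ["Private", "?", "State-gov"], "age": [39, 54, 28]})

# replace() returns a new object, so assign it back to keep the change.
df["workclass"] = df["workclass"].replace("?", np.nan)

# Count missing values per column to confirm the '?' became NaN.
missing = df.isna().sum()
print(missing)
```

If the count for the column stays at zero, the cells probably do not contain exactly '?' (for example, they may carry surrounding whitespace).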

UPDATE

OK, I figured out your problem: by default, if you don't pass a separator, read_csv uses a comma ',' as the separator.

Your data and in particular one example where you have a problematic line:

54, ?, 180211, Some-college, 10, Married-civ-spouse, ?, Husband, Asian-Pac-Islander, Male, 0, 0, 60, South, >50K

in fact uses a comma followed by a space as the separator, so when you passed na_values=['?'] it didn't match: every value after the first has a leading space character, which is easy to overlook.

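If you change your line to this (a regex separator needs the Python parsing engine, and a raw string avoids an invalid-escape warning):

rawfile = pd.read_csv(filename, header=None, names=DataLabels, sep=r',\s', engine='python', na_values=["?"])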

then you should find that it all works:

27      54               NaN  180211  Some-college             10 
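The leading-space problem described above can be reproduced on a small in-memory sample. This is only a sketch: the two rows and the column labels are invented, and it uses read_csv's skipinitialspace=True option as an alternative to the regex separator, not as the method from the answer.

```python
import io
import pandas as pd

# Two made-up rows in the same "comma plus space" layout as the line quoted above.
raw = "54, ?, 180211, Some-college\n39, Private, 77516, Bachelors\n"
names = ["age", "workclass", "fnlwgt", "education"]  # hypothetical column labels

# Default parsing keeps the space after each comma, so the field is not exactly '?'.
naive = pd.read_csv(io.StringIO(raw), header=None, names=names, na_values=["?"])

# skipinitialspace=True drops the space after each delimiter before the NA check.
fixed = pd.read_csv(io.StringIO(raw), header=None, names=names,
                    skipinitialspace=True, na_values=["?"])

print(repr(naive.loc[0, "workclass"]))  # inspect the raw field, including any padding
print(fixed["workclass"].isna().tolist())
```

Printing with repr() makes the otherwise invisible leading space show up in the output.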

Use numpy.nan

Numpy - Replace a number with NaN

import numpy as np
# applymap is deprecated since pandas 2.1 in favor of DataFrame.map
df = df.applymap(lambda x: np.nan if x == '?' else x)

OK, I got it working with:

# Trying to replace '?'
newraw = rawfile.replace('[?]', np.nan, regex=True)
print(newraw[25:40])
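The pattern '[?]' appears to work here because the regex only has to be found somewhere in the cell, so a padded ' ?' qualifies. By the same logic it would probably also blank out any string that merely contains a question mark. A stricter sketch, using made-up sample values, anchors the pattern so that only cells consisting of '?' plus optional whitespace are replaced:

```python
import numpy as np
import pandas as pd

# Made-up values: a padded '?', a bare '?', and strings that should be kept.
rawfile = pd.DataFrame({"workclass": [" ?", "?", "Private", "what?"]})

# Anchored pattern: the whole cell must be '?' with optional surrounding whitespace.
newraw = rawfile.replace(r"^\s*\?\s*$", np.nan, regex=True)
print(newraw)
```

The anchors ^ and $ are the design choice that matters: they keep legitimate values containing '?' intact.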

Tags:

Python

Pandas