reading file with missing values in python pandas
What version of pandas are you on? Interpreting empty string as NaN is the default behavior for pandas and seem to parse the empty strings fine in your data snippet both in v0.7.3 and current master without using the na_values
parameter at all.
In [10]: data = """\
10/08/2012,12:10:10,name1,0.81,4.02,50;18.5701400N,4;07.7693770E,7.92,10.50,0.0106,4.30,0.0301
10/08/2012,12:10:11,name2,,,,,10.87,1.40,0.0099,9.70,0.0686
"""
In [11]: read_csv(StringIO(data), header=None).T
Out[11]:
0 1
X.1 10/08/2012 10/08/2012
X.2 12:10:10 12:10:11
X.3 name1 name2
X.4 0.81 NaN
X.5 4.02 NaN
X.6 50;18.5701400N NaN
X.7 4;07.7693770E NaN
X.8 7.92 10.87
X.9 10.5 1.4
X.10 0.0106 0.0099
X.11 4.3 9.7
X.12 0.0301 0.0686
The parameter na_values
must be "list like" (see this answer).
A string is "list like" so:
na_values='abc' # would transform the letters 'a', 'b' and 'c' each into `nan`
# is equivalent to
na_values=['a','b','c']
Similarly:
na_values=''
# is equivalent to
na_values=[] # and this is not what you want!
This means that you need to use na_values=['']
.