In pandas, when using read_csv(), how do I assign NaN to a value that doesn't match the intended dtype?
I tried creating a CSV to reproduce the problem but couldn't on pandas 0.18, so I can only recommend two methods to handle it:
First
If you know that your missing values are all marked by a string 'none', then do this:
moto = pd.read_csv("test.csv", na_values=['none'])
You can also add other markers to the na_values list if they should be converted to NaN as well.
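Here is a small self-contained sketch of the na_values approach. The CSV text, the column names, and the extra 'unknown' marker are made up for illustration; an in-memory buffer stands in for the real file.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real file.
csv_text = "id,speed\n1,10.5\n2,none\n3,unknown\n4,7\n"

# Every string listed in na_values is parsed as NaN, in addition to
# the markers pandas already recognizes by default.
moto = pd.read_csv(io.StringIO(csv_text), na_values=["none", "unknown"])

print(moto["speed"].dtype)  # numeric, because no stray strings remain
print(moto)
```

Because the markers become NaN at parse time, the column can be read as a float column directly.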
Second
Try your first line again without using the dtype option.
moto = pd.read_csv('reporte.csv')
The read succeeds because you only get a warning, not an error. Now run moto.dtypes
to see which columns have object dtype. For the ones you want to change, do the following:
moto.test_column = pd.to_numeric(moto.test_column, errors='coerce')
The 'coerce' option will convert any problematic entries, like 'none', to NaNs.
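A minimal sketch of this second method, end to end. The data and the column name test_column are invented for the example, and the CSV is read from an in-memory string rather than a real file.

```python
import io

import pandas as pd

# Hypothetical data: one entry in test_column is the string 'none'.
csv_text = "id,test_column\n1,10.5\n2,none\n3,7\n"

# Read without dtype, so the mixed column comes in as strings.
moto = pd.read_csv(io.StringIO(csv_text))
print(moto.dtypes)

# errors='coerce' turns anything that cannot be parsed as a number into NaN.
moto["test_column"] = pd.to_numeric(moto["test_column"], errors="coerce")
print(moto.dtypes)
print(moto)
```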
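To convert the entire dataframe at once, you can use convert_objects on older pandas versions. Its convert_numeric option does the coercion to NaNs. Note that convert_objects as a whole has been deprecated since pandas 0.17 in favor of to_numeric and the other type-specific converters, and it was removed in pandas 1.0, so this only works on older installs: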
moto = moto.convert_objects(convert_numeric=True)
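On pandas versions where convert_objects is not available, a similar whole-frame coercion can be sketched with to_numeric. This is only an approximation, not a drop-in replacement: the helper name coerce_numeric_columns is hypothetical, and leaving alone any column where nothing parses as a number is my assumption about the behavior you would want.

```python
import io

import pandas as pd


def coerce_numeric_columns(df):
    """Return a copy where non-numeric columns are coerced to numbers.

    Hypothetical helper: a column is only replaced if at least one
    value converts, so pure text columns are kept unchanged.
    """
    out = df.copy()
    for name in out.columns:
        if pd.api.types.is_numeric_dtype(out[name]):
            continue  # already numeric, nothing to do
        converted = pd.to_numeric(out[name], errors="coerce")
        if converted.notna().any():
            out[name] = converted
    return out


# Hypothetical data: 'speed' has a bad entry, 'name' is genuinely text.
csv_text = "id,speed,name\n1,10.5,alpha\n2,none,beta\n3,7,gamma\n"
moto = coerce_numeric_columns(pd.read_csv(io.StringIO(csv_text)))
print(moto.dtypes)
```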
After any of these methods, use fillna to replace the NaNs with whatever value you need.
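For example, once the column is numeric with NaNs in place, the fill step could look like this. The frame is built by hand here, and both the fill value 0 and the column mean are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical result of one of the conversions above.
moto = pd.DataFrame({"test_column": [10.5, np.nan, 7.0]})

# Replace NaN with a constant...
filled_zero = moto["test_column"].fillna(0)

# ...or with a statistic computed from the valid values (mean skips NaN).
filled_mean = moto["test_column"].fillna(moto["test_column"].mean())

print(filled_zero.tolist())
print(filled_mean.tolist())
```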