read_csv with dtypes when columns contain NA values
clean_pdf_type = pd.read_csv('table_updated.csv', usecols=col_names)
clean_pdf_type = clean_pdf_type.fillna(0).astype(col_types)
As noted in the comments, don't specify the dtype when reading the file. Fill the NA values first, then cast to the desired type.
Pandas v0.24+
Pandas 0.24 added nullable integer dtypes such as Int64. See "NumPy or Pandas: Keeping array type as integer while having a NaN value".
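As a minimal sketch of the nullable integer approach, assuming pandas 0.24 or later: the CSV text and the column names col1 and col2 below are made up for illustration and are not from the original question.

```python
import io
import pandas as pd

# Hypothetical CSV with missing values in integer-like columns.
csv_text = "col1,col2\n1,4\n,5\n3,\n"

# Read without forcing a dtype: columns with gaps come back as float64.
df = pd.read_csv(io.StringIO(csv_text))

# Cast to the nullable integer dtype; missing entries stay missing.
df = df.astype({"col1": "Int64", "col2": "Int64"})
print(df.dtypes)
```

Note the capital "I" in Int64: it names the nullable extension dtype, not NumPy's int64.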
Pandas pre-v0.24
You cannot have NaN values in an int dtype series. This is unavoidable, because NaN values are considered float:
import numpy as np
type(np.nan) # float
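The same effect shows up at the series level. A small sketch with made-up values, showing how one NaN changes the dtype pandas infers:

```python
import numpy as np
import pandas as pd

ints = pd.Series([1, 2, 3])
with_nan = pd.Series([1, 2, np.nan])

print(ints.dtype)      # an integer dtype (int64 on most platforms)
print(with_nan.dtype)  # float64, because NaN forces an upcast
```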
Your best bet is to read in these columns as float instead. If you are then able to replace NaN values by a filler value such as 0 or -1, you can process accordingly and convert to int:
int_cols = ['col1', 'col2', 'col3']
df[int_cols] = df[int_cols].fillna(-1)
df[int_cols] = df[int_cols].apply(pd.to_numeric, downcast='integer')
The alternative of having mixed int and float values will result in a series of dtype object. It is not recommended.
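For illustration only, here is a sketch of what such an object series looks like. The values are made up, and dtype=object is passed explicitly, because pandas would otherwise infer float64 for this input:

```python
import numpy as np
import pandas as pd

# Force object dtype so the integers are stored as Python ints next to a float NaN.
s = pd.Series([1, np.nan, 3], dtype=object)

print(s.dtype)
print([type(v).__name__ for v in s])
```

Each element is stored as a separate Python object, so vectorised numeric operations lose their speed advantage.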