Pandas fillna throws ValueError: fill value must be in categories
Use Series.cat.add_categories
to add the new category first, then fill:
# Assign the result back instead of calling fillna(inplace=True) on a selected column
AM_train['product_category_2'] = AM_train['product_category_2'].cat.add_categories('Unknown')
AM_train['product_category_2'] = AM_train['product_category_2'].fillna('Unknown')
AM_train['city_development_index'] = AM_train['city_development_index'].cat.add_categories('Missing')
AM_train['city_development_index'] = AM_train['city_development_index'].fillna('Missing')
Sample:
import numpy as np
import pandas as pd
AM_train = pd.DataFrame({'product_category_2': pd.Categorical(['a', 'b', np.nan])})
AM_train['product_category_2'] = AM_train['product_category_2'].cat.add_categories('Unknown')
AM_train['product_category_2'] = AM_train['product_category_2'].fillna('Unknown')
print(AM_train)
  product_category_2
0                  a
1                  b
2            Unknown
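If several categorical columns need the same treatment, the two steps can be wrapped in a small helper. This is only a sketch: the function name fill_categorical and the sample data are made up for illustration. It checks whether the fill value is already a category, because add_categories raises a ValueError when asked to add a category that already exists.

```python
import numpy as np
import pandas as pd


def fill_categorical(series: pd.Series, fill_value) -> pd.Series:
    """Fill missing values in a categorical Series (hypothetical helper).

    Adds fill_value to the categories first if it is not already one,
    so that fillna does not raise "fill value must be in categories".
    """
    if fill_value not in series.cat.categories:
        series = series.cat.add_categories([fill_value])
    return series.fillna(fill_value)


# Illustrative data only; column names borrowed from the question.
df = pd.DataFrame({
    "product_category_2": pd.Categorical(["a", "b", np.nan]),
    "city_development_index": pd.Categorical(["x", np.nan, "y"]),
})

for col in ["product_category_2", "city_development_index"]:
    df[col] = fill_categorical(df[col], "Unknown")

print(df)
```

Because of the membership check, calling the helper a second time on the same column is harmless.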
I was getting the same error on a DataFrame while trying to get rid of all the NaNs.
I did not look into it too deeply, but replacing .fillna(value)
with .replace(np.nan, value)
did the trick.
Use with caution, since I am not sure that np.nan
catches all the values that are interpreted as NaN.
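One way to check that concern, whichever fill method is used, is to count the remaining missing values with isna() afterwards. The sketch below uses made-up data and assumes a categorical column: there, None and np.nan are both stored as the same missing marker (code -1), and isna() reports both.

```python
import numpy as np
import pandas as pd

# Illustrative data: None and np.nan both end up as missing in a Categorical.
s = pd.Series(pd.Categorical(["a", None, np.nan, "b"]))

print(s.isna().tolist())       # [False, True, True, False]
print(s.cat.codes.tolist())    # [0, -1, -1, 1]

# Fill using the add_categories approach, then verify nothing is left missing.
filled = s.cat.add_categories("Unknown").fillna("Unknown")
remaining = int(filled.isna().sum())
print(remaining)               # 0
```

If the remaining count is not zero after a fill, some values were not matched and need separate handling.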