Converting multiple columns to categories in Pandas. apply?
Note that since pandas 0.23.0 you no longer need apply to convert multiple columns to categorical data types. You can simply call df[to_convert].astype('category') instead (where to_convert holds the columns to convert, as defined in the question). astype returns a new object, so assign the result back to df[to_convert] if you want to keep it.
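As a minimal sketch of the approach above: the frame below reuses the sample data from this thread, while the extra column C and the contents of to_convert are assumptions added for illustration, since the question's own definition is not shown here.

```python
import pandas as pd

# Sample data; column "C" is a hypothetical extra column that stays untouched.
df = pd.DataFrame({"A": list("aabbcd"), "B": list("ffghhe"), "C": range(6)})

# Hypothetical stand-in for the question's to_convert.
to_convert = ["A", "B"]

# astype returns a new DataFrame, so assign it back to the same columns.
df[to_convert] = df[to_convert].astype("category")

print(df.dtypes)
```

Only the listed columns change dtype; any column left out of to_convert keeps its original type.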
Using apply for this was just fixed in master, and so will be in 0.17.0 (see the issue here):
In [6]: from pandas import DataFrame
In [7]: df = DataFrame({'A' : list('aabbcd'), 'B' : list('ffghhe')})
In [8]: df
Out[8]:
   A  B
0  a  f
1  a  f
2  b  g
3  b  h
4  c  h
5  d  e
In [9]: df.dtypes
Out[9]:
A    object
B    object
dtype: object
In [10]: df.apply(lambda x: x.astype('category'))
Out[10]:
   A  B
0  a  f
1  a  f
2  b  g
3  b  h
4  c  h
5  d  e
In [11]: df.apply(lambda x: x.astype('category')).dtypes
Out[11]:
A    category
B    category
dtype: object
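The session above only displays the converted frame; apply does not modify df in place. A small sketch, using the same sample data, of keeping the result and checking that each column gets its own set of categories (the name converted is just an illustrative choice):

```python
import pandas as pd

df = pd.DataFrame({"A": list("aabbcd"), "B": list("ffghhe")})

# apply returns a new DataFrame; df itself is left unchanged.
converted = df.apply(lambda col: col.astype("category"))

# Categories are inferred per column from that column's own values.
print(converted["A"].cat.categories.tolist())
print(converted["B"].cat.categories.tolist())
```

To overwrite the original instead, assign the result back, for example df = df.apply(lambda col: col.astype('category')).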