Change data type of a specific column of a pandas dataframe
df['colname'] = df['colname'].astype(int)
works when changing from float
values to int
atleast.
I have tried following:
df['column']=df.column.astype('int64')
and it worked for me.
You can use reindex
by sorted column by sort_values
, cast to int
by astype
:
df = pd.DataFrame({'A':[1,2,3],
'B':[4,5,6],
'colname':['7','3','9'],
'D':[1,3,5],
'E':[5,3,6],
'F':[7,4,3]})
print (df)
A B D E F colname
0 1 4 1 5 7 7
1 2 5 3 3 4 3
2 3 6 5 6 3 9
print (df.colname.astype(int).sort_values())
1 3
0 7
2 9
Name: colname, dtype: int32
print (df.reindex(df.colname.astype(int).sort_values().index))
A B D E F colname
1 2 5 3 3 4 3
0 1 4 1 5 7 7
2 3 6 5 6 3 9
print (df.reindex(df.colname.astype(int).sort_values().index).reset_index(drop=True))
A B D E F colname
0 2 5 3 3 4 3
1 1 4 1 5 7 7
2 3 6 5 6 3 9
If first solution does not works because None
or bad data use to_numeric
:
df = pd.DataFrame({'A':[1,2,3],
'B':[4,5,6],
'colname':['7','3','None'],
'D':[1,3,5],
'E':[5,3,6],
'F':[7,4,3]})
print (df)
A B D E F colname
0 1 4 1 5 7 7
1 2 5 3 3 4 3
2 3 6 5 6 3 None
print (pd.to_numeric(df.colname, errors='coerce').sort_values())
1 3.0
0 7.0
2 NaN
Name: colname, dtype: float64