Best way to one-hot encode data in Python for machine learning: code example
Example: convert categorical data to a one-hot encoding in Python
# Basic syntax:
import pandas as pd
df_onehot = pd.get_dummies(df, columns=['col_name'], prefix=['one_hot'])
# Where:
# - get_dummies replaces the column named col_name with one indicator
#   column per unique categorical value in it
# - The prefix is prepended to each categorical value, joined by an
#   underscore, to create the names of the one-hot columns
# Example usage:
# Build example dataframe:
df = pd.DataFrame(['sunny', 'rainy', 'cloudy'], columns=['weather'])
print(df)
#   weather
# 0   sunny
# 1   rainy
# 2  cloudy
# Convert categorical weather variable to one-hot encoding
# (dtype=int gives 0/1 columns; pandas 2.0+ defaults to True/False):
df_onehot = pd.get_dummies(df, columns=['weather'], prefix=['one_hot'], dtype=int)
print(df_onehot)
#    one_hot_cloudy  one_hot_rainy  one_hot_sunny
# 0               0              0              1
# 1               0              1              0
# 2               1              0              0
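
In a machine learning setting, data seen later may not contain every category that appeared in the training data, so calling get_dummies on it separately can produce fewer columns. The sketch below shows one possible way to handle that, by reindexing the new frame to the training columns. The `train` and `new` frames and their values are made up for illustration and are not part of the original example.

```python
import pandas as pd

# Hypothetical training data and later data (illustrative values only).
train = pd.DataFrame({'weather': ['sunny', 'rainy', 'cloudy']})
new = pd.DataFrame({'weather': ['rainy', 'sunny']})  # 'cloudy' never appears here

# Encode each frame separately; dtype=int requests 0/1 instead of booleans.
train_onehot = pd.get_dummies(train, columns=['weather'], prefix=['one_hot'], dtype=int)
new_onehot = pd.get_dummies(new, columns=['weather'], prefix=['one_hot'], dtype=int)

# Align the new frame to the training columns; categories missing from
# the new data become all-zero columns.
new_aligned = new_onehot.reindex(columns=train_onehot.columns, fill_value=0)
print(new_aligned)
```

Note that reindexing also silently drops any category that appears only in the new data, which may or may not be the behavior you want.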