Fast method for removing duplicate columns in pandas.DataFrame
Perhaps you would be better off avoiding the problem altogether by using pd.merge instead of pd.concat:
df_ab = pd.merge(df_a, df_b, how='inner')
This merges df_a and df_b on all the columns they have in common, so each shared column appears only once in the result.
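To make the difference concrete, here is a minimal sketch comparing the two calls. The frames and the column names ("key", "A", "B", "C") are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical frames that share the columns "key" and "A".
df_a = pd.DataFrame({"key": [0, 1], "A": [5, 6], "B": [10, 19]})
df_b = pd.DataFrame({"key": [0, 1], "A": [5, 6], "C": [7, 8]})

# With no `on` argument, merge joins on the intersection of the column names.
df_ab = pd.merge(df_a, df_b, how="inner")

# A column-wise concat keeps both copies of the shared columns.
df_cat = pd.concat([df_a, df_b], axis=1)

print(list(df_ab.columns))   # ['key', 'A', 'B', 'C']
print(list(df_cat.columns))  # ['key', 'A', 'B', 'key', 'A', 'C']
```

Keep in mind that an inner merge also drops rows whose values in the shared columns do not match, so it is only a substitute for concat when the shared columns really do agree row for row.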
The easiest way is:
df = df.loc[:,~df.columns.duplicated()]
One line of code can change everything
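For anyone wondering what the one-liner does step by step, here is a small sketch. The sample frame is an assumption chosen for illustration. Note that duplicated() compares column names only, not the values stored in the columns, and it keeps the first occurrence of each name.

```python
import pandas as pd

# Hypothetical frame with repeated column names, e.g. after a column-wise concat.
df = pd.DataFrame([[5, 5, 10, 10], [6, 6, 19, 19]], columns=["A", "A", "B", "B"])

# duplicated() flags every repeat of a name after its first occurrence.
mask = df.columns.duplicated()
print(mask.tolist())  # [False, True, False, True]

# Invert the mask to keep only the first column for each name.
deduped = df.loc[:, ~mask]
print(list(deduped.columns))  # ['A', 'B']
```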
You can use np.unique to get the position of the first occurrence of each unique column name, and then select those columns with .iloc:
>>> df
   A  A   B   B
0  5  5  10  10
1  6  6  19  19
>>> _, i = np.unique(df.columns, return_index=True)
>>> df.iloc[:, i]
   A   B
0  5  10
1  6  19
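One caveat with the np.unique approach: the positions come back ordered by sorted column name, so the result can end up with its columns rearranged. The sketch below uses a made-up frame whose names are deliberately not in alphabetical order, and shows one possible way to keep the original order by sorting the positions before indexing.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose column names are not in alphabetical order.
df = pd.DataFrame([[10, 10, 5, 5], [19, 19, 6, 6]], columns=["B", "B", "A", "A"])

# return_index gives the first position of each unique name, ordered by sorted name.
_, i = np.unique(df.columns, return_index=True)
print(i.tolist())  # [2, 0]

# Indexing with i directly puts the columns in alphabetical order.
print(list(df.iloc[:, i].columns))  # ['A', 'B']

# Sorting the positions first keeps the original left-to-right order.
print(list(df.iloc[:, np.sort(i)].columns))  # ['B', 'A']
```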