Renaming columns in a Pandas dataframe with duplicate column names?
Here is another dynamic solution that I think is nicer:
In [59]: df
Out[59]:
   a  x  x  x  z
0  6  2  7  7  8
1  6  6  3  1  1
2  6  6  7  5  6
3  8  3  6  1  8
4  5  7  5  3  0
In [61]: class renamer():
             def __init__(self):
                 self.d = dict()

             def __call__(self, x):
                 if x not in self.d:
                     self.d[x] = 0
                     return x
                 else:
                     self.d[x] += 1
                     return "%s_%d" % (x, self.d[x])

         df.rename(columns=renamer())
Out[61]:
   a  x  x_1  x_2  z
0  6  2    7    7  8
1  6  6    3    1  1
2  6  6    7    5  6
3  8  3    6    1  8
4  5  7    5    3  0
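One caveat with the class above: if the frame already contains a column literally named x_1, the generated name collides with it. Below is a minimal sketch of a collision-safe variant that works on a plain list of labels. The helper name unique_names is my own invention, not part of pandas, and the sketch assumes the labels are hashable.

```python
def unique_names(names):
    """Return labels where repeats get a _N suffix that never clashes."""
    names = list(names)
    taken = set(names)   # every original label is reserved up front
    seen = set()         # labels already emitted unchanged
    counters = {}        # last suffix number tried per label
    result = []
    for name in names:
        if name not in seen:
            # First occurrence keeps its label as-is
            seen.add(name)
            result.append(name)
            continue
        # Repeat: bump the suffix until the candidate is free
        n = counters.get(name, 0) + 1
        candidate = f"{name}_{n}"
        while candidate in taken:
            n += 1
            candidate = f"{name}_{n}"
        counters[name] = n
        taken.add(candidate)
        result.append(candidate)
    return result


print(unique_names(['a', 'x', 'x', 'x', 'z']))  # ['a', 'x', 'x_1', 'x_2', 'z']
print(unique_names(['x', 'x', 'x_1']))          # ['x', 'x_2', 'x_1']
```

With a DataFrame, something like df.columns = unique_names(df.columns) should apply it, since assigning to .columns replaces the labels by position.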
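If you already know the final names, you can assign the full list directly; the labels are replaced by position, so duplicates do not matter:
X_R.columns = ['Retail', 'Cost']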
Here is a dynamic solution:
In [59]: df
Out[59]:
   a  x  x  x  z
0  6  2  7  7  8
1  6  6  3  1  1
2  6  6  7  5  6
3  8  3  6  1  8
4  5  7  5  3  0
In [60]: d
Out[60]: {'x': ['x1', 'x2', 'x3']}
In [61]: df.rename(columns=lambda c: d[c].pop(0) if c in d.keys() else c)
Out[61]:
   a  x1  x2  x3  z
0  6   2   7   7  8
1  6   6   3   1  1
2  6   6   7   5  6
3  8   3   6   1  8
4  5   7   5   3  0
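Note that the lambda pops from the lists in d, so d is emptied after one call, and pop(0) raises IndexError if a name repeats more often than there are replacements. A minimal sketch that avoids both, assuming the same shape of mapping as d above (make_renamer is a hypothetical helper name, not a pandas function):

```python
def make_renamer(new_names):
    """Build a renaming function from {old_label: [new_label, ...]}."""
    # Copy each list into its own iterator so the caller's dict is untouched
    pending = {old: iter(list(names)) for old, names in new_names.items()}

    def rename(col):
        if col not in pending:
            return col
        # Fall back to the original label when the replacements run out
        return next(pending[col], col)

    return rename


d = {'x': ['x1', 'x2', 'x3']}
columns = ['a', 'x', 'x', 'x', 'z']

ren = make_renamer(d)
renamed = [ren(c) for c in columns]
print(renamed)  # ['a', 'x1', 'x2', 'x3', 'z']
print(d)        # {'x': ['x1', 'x2', 'x3']}, unchanged
```

It can be passed the same way as the lambda, e.g. df.rename(columns=make_renamer(d)), relying on rename calling the function once per column in order, as the output above suggests. Build a fresh renamer for every call, because the closure keeps state.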
Not directly an answer, but since this is a top search result, here is a short and flexible solution to append a suffix to duplicate column names:
import itertools

import pandas as pd

# A dataframe with duplicated column names
df = pd.DataFrame([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
df.columns = ['a', 'b', 'b']

# Columns to not rename (names that occur only once)
excluded = df.columns[~df.columns.duplicated(keep=False)]

# An incrementer (one counter shared by all duplicated names)
inc = itertools.count().__next__

# A renamer
def ren(name):
    return f"{name}{inc()}" if name not in excluded else name

# Use inside rename()
df.rename(columns=ren)
   a  b  b          a  b0  b1
0  1  2  3       0  1   2   3
1  4  5  6  =>   1  4   5   6
2  7  8  9       2  7   8   9
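Because the incrementer is a single shared counter, a frame with several duplicated names, say a, b, b, c, c, would come out as a, b0, b1, c2, c3. If you would rather restart the numbering for each name, here is a minimal sketch on a plain list of labels (suffix_duplicates is a hypothetical helper name of mine, not a pandas function):

```python
import itertools
from collections import Counter, defaultdict


def suffix_duplicates(names):
    """Append a per-name running number to every label that occurs twice or more."""
    names = list(names)
    totals = Counter(names)                  # how often each label occurs
    counters = defaultdict(itertools.count)  # one counter per label, starting at 0
    return [
        f"{name}{next(counters[name])}" if totals[name] > 1 else name
        for name in names
    ]


print(suffix_duplicates(['a', 'b', 'b', 'c', 'c']))  # ['a', 'b0', 'b1', 'c0', 'c1']
```

Assigning the result back with df.columns = suffix_duplicates(df.columns) should give the same frame as above for the a, b, b example.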