Rename specific column(s) in pandas
data.rename(columns={'gdp':'log(gdp)'}, inplace=True)
The rename docs show that it accepts a dict as the columns parameter, so you just pass a dict with a single entry.
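As a minimal sketch of the dict-based rename, the snippet below uses a small made-up frame (the values are placeholders, only the column names come from the question). It shows that rename returns a new DataFrame unless inplace=True is passed, and that keys matching no column are ignored by default.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({'y': [1, 2], 'gdp': [3.0, 4.0], 'cap': [5, 6]})

# Without inplace=True, rename leaves `data` untouched and returns a copy.
renamed = data.rename(columns={'gdp': 'log(gdp)'})

# A key that matches no existing column is silently ignored by default.
unchanged = data.rename(columns={'not_a_column': 'whatever'})

print(list(data.columns))
print(list(renamed.columns))
print(list(unchanged.columns))
```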
If you only need to rename a single column, a list comprehension is a much faster implementation:
df.columns = ['log(gdp)' if x=='gdp' else x for x in df.columns]
If the need arises to rename multiple columns, either use conditional expressions like:
df.columns = ['log(gdp)' if x=='gdp' else 'cap_mod' if x=='cap' else x for x in df.columns]
Or construct a mapping with a dictionary and use its get method inside the list comprehension, setting the default value to the old name:
col_dict = {'gdp': 'log(gdp)', 'cap': 'cap_mod'} ## key→old name, value→new name
df.columns = [col_dict.get(x, x) for x in df.columns]
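A short check, under the assumption of a toy frame like the one used elsewhere in this thread, that the dict.get approach produces the same labels as passing the same mapping to rename:

```python
import pandas as pd

# Toy frame; every cell holds the placeholder string 'x'.
df = pd.DataFrame('x', columns=['y', 'gdp', 'cap'], index=range(3))
col_dict = {'gdp': 'log(gdp)', 'cap': 'cap_mod'}  # old name -> new name

# Columns missing from col_dict fall back to their current name.
via_get = [col_dict.get(x, x) for x in df.columns]

# The same mapping handed to rename, for comparison.
via_rename = list(df.rename(columns=col_dict).columns)

print(via_get)
print(via_rename)
```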
Timings:
%%timeit
df.rename(columns={'gdp':'log(gdp)'}, inplace=True)
10000 loops, best of 3: 168 µs per loop
%%timeit
df.columns = ['log(gdp)' if x=='gdp' else x for x in df.columns]
10000 loops, best of 3: 58.5 µs per loop
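To reproduce the comparison outside IPython, something like the following sketch with the standard timeit module should work. The helper names (make_df, with_rename, with_listcomp) are made up for illustration. Each call rebuilds the frame so that the 'gdp' column still exists on every iteration, which means construction cost is included in both figures; absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd


def make_df():
    # Fresh frame per call, so 'gdp' is always present to be renamed.
    return pd.DataFrame('x', columns=['y', 'gdp', 'cap'], index=range(5))


def with_rename():
    df = make_df()
    df.rename(columns={'gdp': 'log(gdp)'}, inplace=True)
    return df


def with_listcomp():
    df = make_df()
    df.columns = ['log(gdp)' if x == 'gdp' else x for x in df.columns]
    return df


n = 200  # number of repetitions; an arbitrary choice
t_rename = timeit.timeit(with_rename, number=n) / n
t_listcomp = timeit.timeit(with_listcomp, number=n) / n

print(f'rename:             {t_rename * 1e6:.1f} µs per call')
print(f'list comprehension: {t_listcomp * 1e6:.1f} µs per call')
```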
How do I rename a specific column in pandas?
From v0.24+, to rename one (or more) columns at a time, use either
DataFrame.rename() with axis=1 or axis='columns' (the axis argument was introduced in v0.21), or
Index.str.replace() for string/regex based replacement.
If you need to rename ALL columns at once, use the DataFrame.set_axis() method with axis=1. Pass a list-like sequence. Options are available for in-place modification as well.
rename with axis=1
df = pd.DataFrame('x', columns=['y', 'gdp', 'cap'], index=range(5))
df
y gdp cap
0 x x x
1 x x x
2 x x x
3 x x x
4 x x x
With 0.21+, you can now specify an axis parameter with rename:
df.rename({'gdp':'log(gdp)'}, axis=1)
# df.rename({'gdp':'log(gdp)'}, axis='columns')
y log(gdp) cap
0 x x x
1 x x x
2 x x x
3 x x x
4 x x x
(Note that rename is not in-place by default, so you will need to assign the result back.)
This addition has been made to improve consistency with the rest of the API. The new axis argument is analogous to the columns parameter; they do the same thing.
df.rename(columns={'gdp': 'log(gdp)'})
y log(gdp) cap
0 x x x
1 x x x
2 x x x
3 x x x
4 x x x
rename also accepts a callback that is called once for each column.
df.rename(lambda x: x[0], axis=1)
# df.rename(lambda x: x[0], axis='columns')
y g c
0 x x x
1 x x x
2 x x x
3 x x x
4 x x x
For this specific scenario, you would want to use:
df.rename(lambda x: 'log(gdp)' if x == 'gdp' else x, axis=1)
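The callback does not have to be a lambda. As a sketch, any callable that takes a label and returns a label should work, including a named function or a built-in such as str.upper; relabel below is a hypothetical helper written for this example.

```python
import pandas as pd

df = pd.DataFrame('x', columns=['y', 'gdp', 'cap'], index=range(3))


def relabel(name):
    """Hypothetical helper: wrap 'gdp' in log(...), leave other labels alone."""
    return f'log({name})' if name == 'gdp' else name


# The callable is applied to each column label in turn.
out = df.rename(columns=relabel)

# Built-in string methods can be passed directly as well.
upper = df.rename(columns=str.upper)

print(list(out.columns))
print(list(upper.columns))
```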
Index.str.replace
Similar to the replace method of strings in Python, pandas Index and Series (object dtype only) define a ("vectorized") str.replace method for string and regex-based replacement.
df.columns = df.columns.str.replace('gdp', 'log(gdp)')
df
y log(gdp) cap
0 x x x
1 x x x
2 x x x
3 x x x
4 x x x
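The advantage of this over the other methods is that str.replace supports regex. Regex matching was enabled by default in older releases; since pandas 2.0 the default is regex=False, so pass regex=True explicitly when you need it. See the docs for more information.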
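One thing to watch for is that str.replace works on substrings, so it can touch more columns than intended. The sketch below uses a made-up extra column, gdp_growth, to show the difference between a literal replacement and an anchored regex; passing regex explicitly keeps the behaviour the same across pandas versions.

```python
import pandas as pd

# 'gdp_growth' is a hypothetical column added to show the substring pitfall.
df = pd.DataFrame('x', columns=['y', 'gdp', 'gdp_growth'], index=range(3))

# Literal replacement: every label containing 'gdp' is affected.
literal = df.columns.str.replace('gdp', 'log(gdp)', regex=False)

# Anchored regex: only the label that is exactly 'gdp' is affected.
anchored = df.columns.str.replace(r'^gdp$', 'log(gdp)', regex=True)

print(list(literal))   # ['y', 'log(gdp)', 'log(gdp)_growth']
print(list(anchored))  # ['y', 'log(gdp)', 'gdp_growth']
```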
Passing a list to set_axis with axis=1
Call set_axis with a list of header(s). The list must be equal in length to the columns/index size. In the pandas versions this answer was written for (0.21 onwards), set_axis mutates the original DataFrame by default, but you can specify inplace=False to return a modified copy.
df.set_axis(['cap', 'log(gdp)', 'y'], axis=1, inplace=False)
# df.set_axis(['cap', 'log(gdp)', 'y'], axis='columns', inplace=False)
cap log(gdp) y
0 x x x
1 x x x
2 x x x
3 x x x
4 x x x
Note: In later releases, inplace defaults to False, and pandas 2.0 removed the argument entirely, so set_axis always returns a new object there.
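On recent pandas, a sketch of the same call without the inplace argument might look like this. It assumes pandas 1.0 or later, where set_axis returns a new object by default; the length check is the same, and a list of the wrong size raises ValueError.

```python
import pandas as pd

df = pd.DataFrame('x', columns=['y', 'gdp', 'cap'], index=range(3))

# Assumes pandas >= 1.0: the result is a new DataFrame, df is left as it was.
relabeled = df.set_axis(['cap', 'log(gdp)', 'y'], axis=1)

# The new labels must match the number of columns exactly.
try:
    df.set_axis(['only', 'two'], axis=1)
    length_error = False
except ValueError:
    length_error = True

print(list(relabeled.columns))
print(list(df.columns))
print(length_error)
```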
Method Chaining
Why choose set_axis when we already have an efficient way of assigning columns with df.columns = ...? As shown by Ted Petrou in this answer, set_axis is useful when trying to chain methods.
Compare
# new for pandas 0.21+
(df.some_method1()
   .some_method2()
   .set_axis()
   .some_method3())
Versus
# old way
df1 = (df.some_method1()
         .some_method2())
df1.columns = columns
df1.some_method3()
The former is a more natural, free-flowing syntax.
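For a concrete version of the chained form, here is a sketch with real methods standing in for the placeholders. The data and the particular steps (sort, reset the index, filter) are made up for illustration, and it assumes pandas 1.0 or later so that set_axis returns a new DataFrame inside the chain.

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({'y': [3, 1, 2], 'gdp': [1.0, 2.0, 3.0], 'cap': [10, 20, 30]})

result = (
    df.sort_values('y')                               # stand-in for some_method1
      .reset_index(drop=True)                         # stand-in for some_method2
      .set_axis(['y', 'log(gdp)', 'cap'], axis=1)     # relabel all columns mid-chain
      .loc[lambda d: d['cap'] > 10]                   # stand-in for some_method3
)

print(result)
```

No intermediate variable is needed, and the original df keeps its column names.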