Pandas: Multilevel column names
Try this:
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
columns = [('c', 'a'), ('c', 'b')]
df.columns = pd.MultiIndex.from_tuples(columns)
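To check what the assignment above does, here is a minimal sketch (the variable name `sub` is only illustrative) that inspects the resulting columns and shows how the two levels are addressed:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df.columns = pd.MultiIndex.from_tuples([("c", "a"), ("c", "b")])

# The column index now has two levels.
n_levels = df.columns.nlevels

# Selecting the top-level key returns a sub-frame that keeps the inner labels.
sub = df["c"]

# A full tuple selects a single column as a Series.
b_values = df[("c", "b")].tolist()

print(n_levels, list(sub.columns), b_values)
```

Each tuple is read as (top level, inner level), so the order of the tuples must match the order of the existing columns.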
A lot of these solutions seem just a bit more complex than they need to be.
I prefer to make things look as simple and intuitive as possible when speed isn't absolutely necessary. I think this solution accomplishes that.
Tested in versions of pandas as early as 0.22.0.
Simply create a DataFrame (ignore the columns in the first step) and then set columns equal to a nested list of column names, with one inner list per level.
In [1]: import pandas as pd
In [2]: df = pd.DataFrame([[1, 1, 1, 1], [2, 2, 2, 2]])
In [3]: df
Out[3]:
   0  1  2  3
0  1  1  1  1
1  2  2  2  2
In [4]: df.columns = [['a', 'c', 'e', 'g'], ['b', 'd', 'f', 'h']]
In [5]: df
Out[5]:
   a  c  e  g
   b  d  f  h
0  1  1  1  1
1  2  2  2  2
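As far as I can tell, assigning a nested list this way gives the same labels as building the index explicitly with pd.MultiIndex.from_arrays. The sketch below compares the two; the level names "upper" and "lower" are made up for illustration and are not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame([[1, 1, 1, 1], [2, 2, 2, 2]])
levels = [["a", "c", "e", "g"], ["b", "d", "f", "h"]]

# Explicit construction, which also lets you name the levels (hypothetical names).
explicit = df.copy()
explicit.columns = pd.MultiIndex.from_arrays(levels, names=["upper", "lower"])

# Shorthand from the answer: assign the nested list directly.
implicit = df.copy()
implicit.columns = levels

# Both produce the same (upper, lower) label pairs.
same_labels = list(explicit.columns) == list(implicit.columns)
print(same_labels)
```

The explicit form is a little longer, but it is the one to reach for if you want named levels.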
You can use concat. Give it a dictionary of DataFrames where each key is the new column level you want to add.
In [46]: d = {}
In [47]: d['first_level'] = pd.DataFrame(columns=['idx', 'a', 'b', 'c'],
                                         data=[[10, 0.89, 0.98, 0.31],
                                               [20, 0.34, 0.78, 0.34]]).set_index('idx')
In [48]: pd.concat(d, axis=1)
Out[48]:
    first_level
              a     b     c
idx
10         0.89  0.98  0.31
20         0.34  0.78  0.34
You can use the same technique to add more top-level groups: each additional key in the dictionary becomes another label in the new outer level.
In [49]: d['second_level'] = pd.DataFrame(columns=['idx', 'a', 'b', 'c'],
                                          data=[[10, 0.29, 0.63, 0.99],
                                                [20, 0.23, 0.26, 0.98]]).set_index('idx')
In [50]: pd.concat(d, axis=1)
Out[50]:
    first_level             second_level
              a     b     c            a     b     c
idx
10         0.89  0.98  0.31         0.29  0.63  0.99
20         0.34  0.78  0.34         0.23  0.26  0.98
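If the frames are not already in a dictionary, pd.concat also accepts a list together with the keys argument, and names can label the resulting levels. This is a sketch of that variant rather than the approach shown above; the level names "group" and "field" and the trimmed two-column frames are assumptions for illustration.

```python
import pandas as pd

first = pd.DataFrame({"a": [0.89, 0.34], "b": [0.98, 0.78]}, index=[10, 20])
second = pd.DataFrame({"a": [0.29, 0.23], "b": [0.63, 0.26]}, index=[10, 20])

# keys supplies the new outer column level; names labels both levels.
combined = pd.concat(
    [first, second],
    axis=1,
    keys=["first_level", "second_level"],
    names=["group", "field"],
)

# Select one column by its full (group, field) tuple.
second_a = combined[("second_level", "a")].tolist()

# Take the cross-section of field "a" across every group.
a_by_group = combined.xs("a", axis=1, level="field")

print(second_a)
print(list(a_by_group.columns))
```

Naming the levels makes later calls such as xs or stack easier to read, since they can refer to a level by name instead of by position.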
There is no need to create a list of tuples.
Use pd.MultiIndex.from_product(iterables):
import pandas as pd
import numpy as np
df = pd.Series(np.random.rand(3), index=["a","b","c"]).to_frame().T
df.columns = pd.MultiIndex.from_product([["new_label"], df.columns])
Resultant DataFrame:
  new_label
          a         b         c
0   0.25999  0.337535  0.333568
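One thing worth keeping in mind is that from_product builds the Cartesian product of its inputs, so it fits best when every outer label pairs with every inner label. A small sketch with fixed values (instead of random data, so the result is reproducible):

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "c"])

# One outer label times three existing columns gives three column tuples.
df.columns = pd.MultiIndex.from_product([["new_label"], df.columns])
labels = list(df.columns)

# Two outer labels times three inner labels gives 2 x 3 = 6 tuples,
# ordered with the first iterable varying slowest.
wide = pd.MultiIndex.from_product([["x", "y"], ["a", "b", "c"]])

print(labels)
print(len(wide), wide[3])
```

If the outer labels do not apply uniformly to all inner labels, from_tuples or from_arrays is likely the better fit.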
Pull request from Jan 25, 2014