How to move pandas data from index to column after multiple groupby
If you have the MultiIndex
and want to reset only a specific index level you can use the parameter level
in reset_index
. For example:
index = pd.MultiIndex.from_tuples([('one', 'a'), ('one', 'b'), ('two', 'a'), ('two', 'b')], names=['A', 'B'])
s = pd.DataFrame(np.arange(1.0, 5.0), index=index, columns=['C'])
C
A B
one a 1.0
b 2.0
two a 3.0
b 4.0
Reset the first level:
df.reset_index(level=0)
Output:
A C
B
a one 1.0
b one 2.0
a two 3.0
b two 4.0
Reset the second level:
df.reset_index(level=1)
Output:
B C
A
one a 1.0
one b 2.0
two a 3.0
two b 4.0
I defer form the accepted answer.
While there are 2 ways to do this, these will not necessarily result in same output. Specially when you are using Grouper
in groupby
index=False
reset_index()
example df
+---------+---------+-------------+------------+
| column1 | column2 | column_date | column_sum |
+---------+---------+-------------+------------+
| A | M | 26-10-2018 | 2 |
| B | M | 28-10-2018 | 3 |
| A | M | 30-10-2018 | 6 |
| B | M | 01-11-2018 | 3 |
| C | N | 03-11-2018 | 4 |
+---------+---------+-------------+------------+
They do not work the same way.
df = df.groupby(
by=[
'column1',
'column2',
pd.Grouper(key='column_date', freq='M')
],
as_index=False
).sum()
The above will give
+---------+---------+------------+
| column1 | column2 | column_sum |
+---------+---------+------------+
| A | M | 8 |
| B | M | 3 |
| B | M | 3 |
| C | N | 4 |
+---------+---------+------------+
While,
df = df.groupby(
by=[
'column1',
'column2',
pd.Grouper(key='column_date', freq='M')
]
).sum().reset_index()
Will give
+---------+---------+-------------+------------+
| column1 | column2 | column_date | column_sum |
+---------+---------+-------------+------------+
| A | M | 31-10-2018 | 8 |
| B | M | 31-10-2018 | 3 |
| B | M | 30-11-2018 | 3 |
| C | N | 30-11-2018 | 4 |
+---------+---------+-------------+------------+
Method #1: reset_index()
>>> g
uses books
sum sum
token year
xanthos 1830 3 3
1840 3 3
1868 2 2
1875 1 1
[4 rows x 2 columns]
>>> g = g.reset_index()
>>> g
token year uses books
sum sum
0 xanthos 1830 3 3
1 xanthos 1840 3 3
2 xanthos 1868 2 2
3 xanthos 1875 1 1
[4 rows x 4 columns]
Method #2: don't make the index in the first place, using as_index=False
>>> g = dfalph[['token', 'year', 'uses', 'books']].groupby(['token', 'year'], as_index=False).sum()
>>> g
token year uses books
0 xanthos 1830 3 3
1 xanthos 1840 3 3
2 xanthos 1868 2 2
3 xanthos 1875 1 1
[4 rows x 4 columns]
You need to add drop=True
:
df.reset_index(drop=True)
df = df.groupby(
by=[
'column1',
'column2',
pd.Grouper(key='column_date', freq='M')
]
).sum().reset_index(drop=True)