how to unstack (or pivot?) in pandas
Using your df2
:
>>> df2.pivot_table(values='value', index=['DATE', 'variable'], columns="BORDER")
BORDER FRANCE GERMANY ITALY
DATE variable
2014-01-01 HOUR1 2 2 2
HOUR2 3 3 3
HOUR3 8 8 8
2014-01-02 HOUR1 4 4 4
HOUR2 5 5 5
HOUR3 12 12 12
2014-01-03 HOUR1 6 6 6
HOUR2 7 7 7
HOUR3 99 99 99
[9 rows x 3 columns]
There is still a bit of cleanup to do if you want to convert the index level "variable" into a column called "HOUR" and strip out the text "HOUR" from the values, but I think that is the basic format you want.
We want values (e.g. 'GERMANY'
) to become column names, and column names (e.g. 'HOUR1'
) to become values -- a swap of sorts.
The stack
method turns column names into index values, and
the unstack
method turns index values into column names.
So by shifting the values into the index, we can use stack
and unstack
to perform the swap.
import pandas as pd
datelisttemp = pd.date_range('1/1/2014', periods=3, freq='D')
s = list(datelisttemp)*3
s.sort()
df = pd.DataFrame({'BORDER':['GERMANY','FRANCE','ITALY','GERMANY','FRANCE','ITALY','GERMANY','FRANCE','ITALY' ], 'HOUR1':[2 ,2 ,2 ,4 ,4 ,4 ,6 ,6, 6],'HOUR2':[3 ,3 ,3, 5 ,5 ,5, 7, 7, 7], 'HOUR3':[8 ,8 ,8, 12 ,12 ,12, 99, 99, 99]}, index=s)
df = df.set_index(['BORDER'], append=True)
df.columns.name = 'HOUR'
df = df.unstack('BORDER')
df = df.stack('HOUR')
df = df.reset_index('HOUR')
df['HOUR'] = df['HOUR'].str.replace('HOUR', '').astype('int')
print(df)
yields
BORDER HOUR FRANCE GERMANY ITALY
2014-01-01 1 2 2 2
2014-01-01 2 3 3 3
2014-01-01 3 8 8 8
2014-01-02 1 4 4 4
2014-01-02 2 5 5 5
2014-01-02 3 12 12 12
2014-01-03 1 6 6 6
2014-01-03 2 7 7 7
2014-01-03 3 99 99 99