Combining multiple columns with same name in pandas dataframe
Try groupby with axis=1:
df.groupby(df.columns.values, axis=1).agg(lambda x: x.values.tolist()).sum().apply(pd.Series).T.sort_values('pp')
Out[320]:
          b   pp
0  0.001464  5.0
2  0.001459  5.0
1  0.001853  6.0
3  0.001843  6.0
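
Newer pandas releases deprecate grouping along axis=1, so the one-liner above may warn or fail depending on the installed version. Below is a minimal sketch of the same reshaping without it. The sample frame is an assumption reconstructed from the output shown (columns pp, b, pp, b), and the name `res` is illustrative:

```python
import pandas as pd

# Hypothetical sample frame reconstructed from the output above:
# two columns named 'pp' and two named 'b'.
df = pd.DataFrame(
    [[5, 0.001464, 6, 0.001853],
     [5, 0.001459, 6, 0.001843]],
    columns=['pp', 'b', 'pp', 'b'],
)

# For each distinct name, select every column carrying that name with a
# boolean mask, then flatten the resulting block row by row.
res = pd.DataFrame({
    name: df.loc[:, df.columns == name].to_numpy().ravel()
    for name in df.columns.unique()
})

# mergesort is stable, so rows keep their relative order within each 'pp' value.
res = res.sort_values('pp', kind='mergesort')
print(res)
```

Unlike the groupby version, pp keeps its integer dtype here because the values never pass through a mixed-type Series.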
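A fun way with wide_to_long: first make the duplicate names unique by appending a running counter, then reshape.
s = pd.Series(df.columns)
# Append 0, 1, ... to each repeated name, e.g. pp0, b0, pp1, b1
df.columns = df.columns + s.groupby(s).cumcount().astype(str)
# Use a raw string for the regex so that \d is not treated as a string escape
pd.wide_to_long(df.reset_index(), stubnames=['pp', 'b'], i='index', j='drop', suffix=r'\d+')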
Out[342]:
            pp         b
index drop
0     0      5  0.001464
1     0      5  0.001459
0     1      6  0.001853
1     1      6  0.001843
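
To see what the renaming step does on its own, here is a minimal sketch on a hypothetical sample frame (reconstructed from the output above; the names `suffix` and `long_df` are illustrative). Each duplicate name gets a counter appended, which gives wide_to_long a numeric suffix to split on:

```python
import pandas as pd

# Hypothetical sample frame with duplicate column names.
df = pd.DataFrame(
    [[5, 0.001464, 6, 0.001853],
     [5, 0.001459, 6, 0.001843]],
    columns=['pp', 'b', 'pp', 'b'],
)

s = pd.Series(df.columns)
# cumcount numbers each repeated name 0, 1, ... in order of appearance.
suffix = s.groupby(s).cumcount().astype(str)
df.columns = (s + suffix).tolist()
print(df.columns.tolist())  # ['pp0', 'b0', 'pp1', 'b1']

# Split every column into stub ('pp' or 'b') and numeric suffix; the suffix
# becomes the second index level, named 'drop' here as in the answer.
long_df = pd.wide_to_long(
    df.reset_index(),
    stubnames=['pp', 'b'],
    i='index',
    j='drop',
    suffix=r'\d+',
)
print(long_df)
```

If the extra index levels are not needed, `long_df.reset_index(drop=True)` should leave just the pp and b columns.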
This is possible using numpy:
res = pd.DataFrame({'pp': df['pp'].values.T.ravel(),
                    'b': df['b'].values.T.ravel()})
print(res)
          b  pp
0  0.001464   5
1  0.001459   5
2  0.001853   6
3  0.001843   6
Or without referencing specific columns explicitly:
res = pd.DataFrame({i: df[i].values.T.ravel() for i in set(df.columns)})
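
The transpose before ravel is what keeps the values of the first duplicate column ahead of those of the second. Here is a small sketch of that ordering, using a hypothetical 2x2 block standing in for df['pp'].values. It also swaps in df.columns.unique() for set(df.columns) so the column order is predictable; that substitution is an assumption of this sketch, not part of the answer:

```python
import numpy as np
import pandas as pd

# Hypothetical block: what df['pp'].values looks like with two 'pp' columns.
block = np.array([[5, 6],
                  [5, 6]])

row_major = block.ravel()           # walks row by row: 5, 6, 5, 6
col_major = block.T.ravel()         # walks column by column: 5, 5, 6, 6
same = block.ravel(order='F')       # column-major order without the transpose

# Hypothetical sample frame with duplicate column names.
df = pd.DataFrame(
    [[5, 0.001464, 6, 0.001853],
     [5, 0.001459, 6, 0.001843]],
    columns=['pp', 'b', 'pp', 'b'],
)

# unique() preserves first-appearance order, unlike a Python set.
res = pd.DataFrame({
    name: df[name].to_numpy().ravel(order='F')
    for name in df.columns.unique()
})
print(res)
```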
Let's use melt, cumcount and unstack:
dm = df.melt()
dm.set_index(['variable',dm.groupby('variable').cumcount()])\
.sort_index()['value'].unstack(0)
Output:
variable         b   pp
0         0.001464  5.0
1         0.001459  5.0
2         0.001853  6.0
3         0.001843  6.0
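
Broken into steps on a hypothetical sample frame (reconstructed from the output above; the intermediate names `pos` and `res` are illustrative), the chain reads roughly as follows:

```python
import pandas as pd

# Hypothetical sample frame with duplicate column names.
df = pd.DataFrame(
    [[5, 0.001464, 6, 0.001853],
     [5, 0.001459, 6, 0.001843]],
    columns=['pp', 'b', 'pp', 'b'],
)

# 1. melt stacks every column into (variable, value) pairs, one column
#    after another, so duplicate names end up sharing a 'variable' label.
dm = df.melt()

# 2. cumcount gives each value its position within its 'variable' group.
pos = dm.groupby('variable').cumcount()

# 3. Index by (variable, position), then move 'variable' back to the columns.
res = (
    dm.set_index(['variable', pos])
      .sort_index()['value']
      .unstack(0)
)
print(res)
```

Because melt puts the integer and float columns into a single value column, pp comes back as float, which matches the 5.0 and 6.0 in the output above.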