Merge pandas DataFrame columns starting with the same letters
Use a dictionary comprehension:
# Group the columns by their first letter, then flatten each group into one Series
df = pd.DataFrame({i: pd.Series(x.to_numpy().ravel())
                   for i, x in df.groupby(lambda x: x[0], axis=1)})
print(df)
   a  b    c
0  1  5  9.0
1  3  7  0.0
2  2  6  NaN
3  4  8  NaN
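Since the input frame is not shown, here is a minimal self-contained sketch of the same idea. The column names (a1, a2, b1, b2, c1) and values are assumptions, picked only so that the result matches the output above. It selects columns with a boolean mask on the first letter rather than grouping along axis=1, which is a substitute for the answer's groupby call, not the answer's own code.

```python
import pandas as pd

# Hypothetical input frame (an assumption; the original data is not shown)
df = pd.DataFrame({
    "a1": [1, 2], "a2": [3, 4],
    "b1": [5, 6], "b2": [7, 8],
    "c1": [9, 0],
})

first = df.columns.str[0]  # first letter of every column name

# For each letter, take the matching columns and flatten them row by row
merged = pd.DataFrame({
    key: pd.Series(df.loc[:, first == key].to_numpy().ravel())
    for key in first.unique()
})
print(merged)
```

ravel flattens the underlying array row by row, so the values of a1 and a2 are interleaved (1, 3, 2, 4). Shorter groups such as c are padded with NaN when the Series are aligned.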
I'd recommend melt, followed by pivot. To resolve duplicates, you'll need to pivot on a cumcounted column.
u = df.melt()
u['variable'] = u['variable'].str[0]  # extract the first letter
# pivot takes keyword arguments only in pandas 2.0+
u.assign(count=u.groupby('variable').cumcount()).pivot(index='count', columns='variable', values='value')
variable    a    b    c
count
0         1.0  5.0  9.0
1         2.0  6.0  0.0
2         3.0  7.0  NaN
3         4.0  8.0  NaN
This can be rewritten as:
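u = df.melt()
u['variable'] = [x[0] for x in u['variable']]
u.insert(0, 'count', u.groupby('variable').cumcount())
# u's columns are now ['count', 'variable', 'value'], matching pivot's index, columns, values.
# Before pandas 2.0 this could be written u.pivot(*u); pivot now takes keyword arguments only.
u.pivot(index='count', columns='variable', values='value')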
variable    a    b    c
count
0         1.0  5.0  9.0
1         2.0  6.0  0.0
2         3.0  7.0  NaN
3         4.0  8.0  NaN
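For reference, the melt and pivot steps can be put together as a runnable sketch. The input frame below is an assumption (the original data is not shown), chosen so the result matches the output above.

```python
import pandas as pd

# Hypothetical input frame (an assumption; the original data is not shown)
df = pd.DataFrame({
    "a1": [1, 2], "a2": [3, 4],
    "b1": [5, 6], "b2": [7, 8],
    "c1": [9, 0],
})

u = df.melt()                                  # long format: 'variable', 'value'
u["variable"] = u["variable"].str[0]           # keep only the first letter
u["count"] = u.groupby("variable").cumcount()  # running position within each letter

# 'count' gives every (count, variable) pair a unique slot, so pivot does not
# fail on duplicate entries
result = u.pivot(index="count", columns="variable", values="value")
print(result)
```

melt stacks one column after the other, so the values come out in column order (1, 2, 3, 4 for a), whereas the ravel-based approaches interleave them row by row (1, 3, 2, 4). Pick whichever ordering you need.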
If performance matters, here's an alternative with pd.concat:
from operator import itemgetter

pd.concat({
    k: pd.Series(g.values.ravel())
    for k, g in df.groupby(itemgetter(0), axis=1)  # group columns by first letter
}, axis=1)
   a  b    c
0  1  5  9.0
1  3  7  0.0
2  2  6  NaN
3  4  8  NaN
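Grouping with axis=1 is deprecated as of pandas 2.1, where the suggested replacement is to group the transposed frame. A sketch of the pd.concat approach written that way follows; the input frame is again an assumption, since the original data is not shown.

```python
import pandas as pd

# Hypothetical input frame (an assumption; the original data is not shown)
df = pd.DataFrame({
    "a1": [1, 2], "a2": [3, 4],
    "b1": [5, 6], "b2": [7, 8],
    "c1": [9, 0],
})

# Transpose so the column names become the index, group on their first letter,
# then transpose each group back before flattening it row by row
merged = pd.concat(
    {
        key: pd.Series(group.T.to_numpy().ravel())
        for key, group in df.T.groupby(lambda name: name[0])
    },
    axis=1,
)
print(merged)
```

One caveat: transposing a frame whose columns have different dtypes upcasts everything to a common dtype (often object), so this form is best suited to homogeneous data like the example here.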