Merge multiple DataFrames Pandas
I would use append.
>>> df1.append(df2).append(df3).sort_values('depth')
VAR1 VAR2 VAR3 depth profile
0 38.196202 NaN NaN 0.5 profile_1
1 38.198002 NaN NaN 0.6 profile_1
0 NaN 0.20440 NaN 0.6 profile_1
1 NaN 0.20442 NaN 1.1 profile_1
2 NaN 0.20446 NaN 1.2 profile_1
0 NaN NaN 15.188 1.2 profile_1
2 38.200001 NaN NaN 1.3 profile_1
1 NaN NaN 15.182 1.3 profile_1
2 NaN NaN 15.182 1.4 profile_1
Obviously if you have a lot of dataframes, just make a list and loop through them.
A simple way is with a combination of functools.partial
/reduce
.
Firstly partial
allows to "freeze" some portion of a function’s arguments and/or keywords resulting in a new object with a simplified signature. Then with reduce
we can apply cumulatively the new partial object to the items of iterable (list of dataframes here):
from functools import partial, reduce
dfs = [df1, df2, df3]
merge = partial(pd.merge, on=['depth', 'profile'], how='outer')
reduce(merge, dfs)
depth VAR1 profile VAR2 VAR3
0 0.6 38.198002 profile_1 0.20440 NaN
1 0.6 38.198002 profile_1 0.20440 NaN
2 1.3 38.200001 profile_1 NaN 15.182
3 1.1 NaN profile_1 0.20442 NaN
4 1.2 NaN profile_1 0.20446 15.188
5 1.4 NaN profile_1 NaN 15.182
Consider setting index on each data frame and then run the horizontal merge with pd.concat
:
dfs = [df.set_index(['profile', 'depth']) for df in [df1, df2, df3]]
print(pd.concat(dfs, axis=1).reset_index())
# profile depth VAR1 VAR2 VAR3
# 0 profile_1 0.5 38.198002 NaN NaN
# 1 profile_1 0.6 38.198002 0.20440 NaN
# 2 profile_1 1.1 NaN 0.20442 NaN
# 3 profile_1 1.2 NaN 0.20446 15.188
# 4 profile_1 1.3 38.200001 NaN 15.182
# 5 profile_1 1.4 NaN NaN 15.182