How to join many fragmented time series in one regular Pandas DataFrame in Python
First join both DataFrame
s together by concat
with DataFrame.set_index
and if possible duplicates use sum for unique MultiIndex
created by timestamps and Sensor
s.
Then add missing rows with DataFrame.reindex
by MultiIndex.from_product
with minumal and maximal dates by date_range
:
df = (pd.concat([df_a.set_index(['Timestamp','Sensor']),
df_b.set_index(['Timestamp','Sensor'])], sort=True)
.sum(level=[0,1],min_count=1))
d = df.index.get_level_values(0)
mux = pd.MultiIndex.from_product([pd.date_range(d.min(), d.max(), freq='5Min'),
df.index.get_level_values(1).unique()], names=df.index.names)
df = df.reindex(mux).reset_index()
print (df)
Timestamp Sensor Humidity Pressure Temperature
0 2019-05-25 10:00:00 Sensor_1 60.0 NaN 25.0
1 2019-05-25 10:00:00 Sensor_2 45.0 NaN 30.0
2 2019-05-25 10:00:00 Sensor_3 NaN NaN NaN
3 2019-05-25 10:05:00 Sensor_1 NaN 1020.0 26.0
4 2019-05-25 10:05:00 Sensor_2 46.0 956.0 30.0
5 2019-05-25 10:05:00 Sensor_3 NaN 990.0 NaN
6 2019-05-25 10:10:00 Sensor_1 63.0 1021.0 27.0
7 2019-05-25 10:10:00 Sensor_2 NaN 957.0 NaN
8 2019-05-25 10:10:00 Sensor_3 NaN 992.0 NaN
9 2019-05-25 10:15:00 Sensor_1 NaN 1019.0 NaN
10 2019-05-25 10:15:00 Sensor_2 NaN NaN NaN
11 2019-05-25 10:15:00 Sensor_3 NaN NaN NaN
12 2019-05-25 10:20:00 Sensor_1 62.0 NaN 28.0
13 2019-05-25 10:20:00 Sensor_2 NaN NaN NaN
14 2019-05-25 10:20:00 Sensor_3 NaN NaN NaN