Pandas equivalent of Oracle Lead/Lag function
You could perform a groupby/apply (shift) operation:
In [15]: df['Data_lagged'] = df.groupby(['Group'])['Data'].shift(1)
In [16]: df
Out[16]:
Date Group Data Data_lagged
2014-05-14 09:10:00 A 1 NaN
2014-05-14 09:20:00 A 2 1
2014-05-14 09:30:00 A 3 2
2014-05-14 09:40:00 A 4 3
2014-05-14 09:50:00 A 5 4
2014-05-14 10:00:00 B 1 NaN
2014-05-14 10:10:00 B 2 1
2014-05-14 10:20:00 B 3 2
2014-05-14 10:30:00 B 4 3
[9 rows x 4 columns]
To obtain the ORDER BY Date ASC
effect, you must sort the DataFrame first:
df['Data_lagged'] = (df.sort_values(by=['Date'], ascending=True)
.groupby(['Group'])['Data'].shift(1))
For lead operation in pandas, one need to just use shift(-1)
instead of 1
df['Data_lead'] = df.groupby(['Group'])['Data'].shift(-1)