Logarithmic returns in pandas dataframe
Here is one way to calculate log return using .shift()
. And the result is similar to but not the same as the gross return calculated by pct_change()
. Can you upload a copy of your sample data (dropbox share link) to reproduce the inconsistency you saw?
import pandas as pd
import numpy as np
np.random.seed(0)
df = pd.DataFrame(100 + np.random.randn(100).cumsum(), columns=['price'])
df['pct_change'] = df.price.pct_change()
df['log_ret'] = np.log(df.price) - np.log(df.price.shift(1))
Out[56]:
price pct_change log_ret
0 101.7641 NaN NaN
1 102.1642 0.0039 0.0039
2 103.1429 0.0096 0.0095
3 105.3838 0.0217 0.0215
4 107.2514 0.0177 0.0176
5 106.2741 -0.0091 -0.0092
6 107.2242 0.0089 0.0089
7 107.0729 -0.0014 -0.0014
.. ... ... ...
92 101.6160 0.0021 0.0021
93 102.5926 0.0096 0.0096
94 102.9490 0.0035 0.0035
95 103.6555 0.0069 0.0068
96 103.6660 0.0001 0.0001
97 105.4519 0.0172 0.0171
98 105.5788 0.0012 0.0012
99 105.9808 0.0038 0.0038
[100 rows x 3 columns]
Log returns are simply the natural log of 1 plus the arithmetic return. So how about this?
df['pct_change'] = df.price.pct_change()
df['log_return'] = np.log(1 + df.pct_change)
Even more concise, utilizing Ximix's suggestion:
df['log_return'] = np.log1p(df.price.pct_change())