Speeding up an iloc solution within a pandas dataframe
One trick for vectorizing a recurrence like this is to rewrite it in terms of cumulative sums.
In [11]: x = df["A"].shift(-1).cumsum().shift().fillna(0)
In [12]: x
Out[12]:
2015-01-01     0
2015-01-02    10
2015-01-03    13
2015-01-04    17
Name: A, dtype: float64
In [13]: df["B"].cumsum() - x
Out[13]:
2015-01-01    10
2015-01-02     0
2015-01-03    -3
2015-01-04    -7
dtype: float64
In [14]: df["B"].cumsum() - x + 2 * df["A"]
Out[14]:
2015-01-01    20
2015-01-02    20
2015-01-03     3
2015-01-04     1
dtype: float64
Note: The first row is a special case. The formula gives 20 there, so you have to set that value back to 3 afterwards.
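To spell out why this works: if each new B is the previous B minus the current A, then B[i] is just B[0] minus the running total A[1] + ... + A[i], and since C[i] = B[i-1] + A[i] = (B[i] + A[i]) + A[i], the second column is B[i] + 2 * A[i]. Below is a minimal end-to-end sketch of the steps above, including the first-row fix. The sample frame is an assumption reconstructed from the outputs shown (the zeros after the first row of B and C are placeholders), and `new_b` / `new_c` are illustrative names only.

```python
import pandas as pd

# Assumed sample data, inferred from the outputs shown above (not the asker's actual frame).
idx = pd.date_range("2015-01-01", periods=4)
df = pd.DataFrame(
    {"A": [5, 10, 3, 4], "B": [10, 0, 0, 0], "C": [3, 0, 0, 0]},
    index=idx,
)

# Running total of A that skips the first row: row i holds A[1] + ... + A[i], row 0 holds 0.
x = df["A"].shift(-1).cumsum().shift().fillna(0)

# B is 0 after the first row in this sample, so its cumsum is just the starting value repeated.
new_b = df["B"].cumsum() - x

# C[i] = B[i-1] + A[i] = B[i] + 2 * A[i] for every row after the first.
new_c = new_b + 2 * df["A"]

# The first row has no predecessor, so restore its original value.
new_c.iloc[0] = df["C"].iloc[0]
```

Both results come back as float64 because of the `shift`; cast with `.astype(int)` before assigning them to the frame if you need integers.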
Recursive calculations like this can be hard to vectorize. numba usually handles them well. If you need to redistribute your code, Cython may be a better choice, as it produces regular C extensions with no extra dependencies.
In [88]: import numba
    ...: import numpy as np

In [89]: @numba.jit(nopython=True)
    ...: def logic(a, b, c):
    ...:     N = len(a)
    ...:     out = np.zeros((N, 2), dtype=np.int64)
    ...:     for i in range(N):
    ...:         if i == 0:
    ...:             # Seed the recurrence with the initial B and C values.
    ...:             out[i, 0] = b[i]
    ...:             out[i, 1] = c[i]
    ...:         else:
    ...:             # Both columns depend on the previous row's first column.
    ...:             out[i, 0] = out[i-1, 0] - a[i]
    ...:             out[i, 1] = out[i-1, 0] + a[i]
    ...:     return out
In [90]: logic(df.A.values, df.B.values, df.C.values)
Out[90]:
array([[10,  3],
       [ 0, 20],
       [-3,  3],
       [-7,  1]], dtype=int64)
In [91]: df[['B','C']] = logic(df.A.values, df.B.values, df.C.values)
Edit: As the other answers show, this problem can in fact be vectorized, and you should probably use that approach instead.
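For comparison, here is a rough NumPy-only sketch that returns the same two-column array as the loop, without numba. `logic_loop` and `logic_vectorized` are hypothetical names introduced for this illustration, the input arrays are assumed values consistent with the output shown above, and both functions assume non-empty integer input.

```python
import numpy as np

def logic_loop(a, b, c):
    """Plain-Python reference for the same recurrence as the jitted loop."""
    out = np.zeros((len(a), 2), dtype=np.int64)
    out[0, 0], out[0, 1] = b[0], c[0]
    for i in range(1, len(a)):
        out[i, 0] = out[i - 1, 0] - a[i]
        out[i, 1] = out[i - 1, 0] + a[i]
    return out

def logic_vectorized(a, b, c):
    """Hypothetical helper: closed form of the recurrence using cumsum."""
    a = np.asarray(a, dtype=np.int64)
    out = np.empty((len(a), 2), dtype=np.int64)
    # B[i] = B[0] - (A[1] + ... + A[i]); cumsum includes A[0], so subtract it back out.
    out[:, 0] = b[0] - (np.cumsum(a) - a[0])
    # The first row keeps its initial C; later rows use the previous row's B.
    out[0, 1] = c[0]
    out[1:, 1] = out[:-1, 0] + a[1:]
    return out

# Assumed inputs, chosen to match the output shown above.
a = np.array([5, 10, 3, 4])
b = np.array([10, 0, 0, 0])
c = np.array([3, 0, 0, 0])
result = logic_vectorized(a, b, c)
```

Only `b[0]` and `c[0]` are read, so this version does not depend on what the remaining rows of B and C contain. Checking it against `logic_loop` on your own data is a cheap way to confirm the closed form before dropping the loop.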