Creating a Pandas rolling-window series of arrays
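Your data looks like a strided array:
# pad with two leading NaNs, then step backwards along each row (strides are in bytes; 8 assumes float64)
data=np.lib.stride_tricks.as_strided(np.concatenate(([np.nan]*2,s))[2:],(5,3),(8,-8))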
"""
array([[ 1. , nan, nan],
[ 1.1, 1. , nan],
[ 1.2, 1.1, 1. ],
[ 1.3, 1.2, 1.1],
[ 1.4, 1.3, 1.2]])
"""
Then convert it to a Series:
pd.Series(list(map(list,data)))
"""
0 [1.0, nan, nan]
1 [1.1, 1.0, nan]
2 [1.2, 1.1, 1.0]
3 [1.3, 1.2, 1.1]
4 [1.4, 1.3, 1.2]
dtype: object
"""
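On NumPy 1.20 or later, the same windows can be built without hand-written byte strides. The sketch below swaps in np.lib.stride_tricks.sliding_window_view, which is a different technique from the as_strided call above; the sample series and the names padded, windows and result are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Sample data assumed to match the series used in this thread.
s = pd.Series([1.0, 1.1, 1.2, 1.3, 1.4], index=[1, 2, 3, 4, 5])
n = 3  # window length

# Pad the front with n-1 NaNs so every original element gets a full window.
padded = np.concatenate((np.full(n - 1, np.nan), s.to_numpy()))

# Each row is one window; reversing the columns puts the newest value first.
windows = np.lib.stride_tricks.sliding_window_view(padded, n)[:, ::-1]

result = pd.Series(windows.tolist(), index=s.index)
print(result)
```

The view returned by sliding_window_view is read-only, which avoids the accidental overwrites that a raw as_strided view allows.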
Here's a vectorized approach using NumPy broadcasting -
n = 3 # window length
idx = np.arange(n)[::-1] + np.arange(len(s))[:,None] - n + 1
out = s.to_numpy()[idx]  # Series.get_values() was removed in pandas 1.0
out[idx<0] = np.nan
This gets you the output as a 2D array.
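To see why the masking step is needed, it helps to look at the index matrix on its own. This is a small sketch in which a hypothetical length of 5 stands in for len(s):

```python
import numpy as np

n = 3       # window length
length = 5  # stands in for len(s); an assumption for this illustration

# Row i holds the positions [i, i-1, ..., i-n+1].
idx = np.arange(n)[::-1] + np.arange(length)[:, None] - n + 1
print(idx)

# Negative positions would wrap around to the end of the array when used
# for indexing, so those slots are the ones overwritten with NaN.
needs_nan = idx < 0
print(needs_nan)
```

The first n-1 rows contain negative positions, which is why the approach indexes first and then replaces exactly those entries.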
To get a series with each element holding each window as a list -
In [40]: pd.Series(out.tolist())
Out[40]:
0 [1.0, nan, nan]
1 [1.1, 1.0, nan]
2 [1.2, 1.1, 1.0]
3 [1.3, 1.2, 1.1]
4 [1.4, 1.3, 1.2]
dtype: object
If you want a list with one array per window, you can use np.split
on the output (each piece keeps a leading axis of length 1), like so -
out_split = np.split(out,out.shape[0],axis=0)
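Note that np.split keeps each piece two-dimensional. If truly 1D arrays are wanted, iterating over the rows is one option. A small sketch comparing the two, using a shortened stand-in for out:

```python
import numpy as np

# Shortened stand-in for the 2D output built above.
out = np.array([[1.0, np.nan, np.nan],
                [1.1, 1.0, np.nan],
                [1.2, 1.1, 1.0]])

pieces_2d = np.split(out, out.shape[0], axis=0)  # each piece has shape (1, 3)
pieces_1d = list(out)                            # each piece has shape (3,)

print(pieces_2d[0].shape, pieces_1d[0].shape)
```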
Sample run -
In [100]: s
Out[100]:
1 1.0
2 1.1
3 1.2
4 1.3
5 1.4
dtype: float64
In [101]: n = 3
In [102]: idx = np.arange(n)[::-1] + np.arange(len(s))[:,None] - n + 1
...: out = s.to_numpy()[idx]
...: out[idx<0] = np.nan
...:
In [103]: out
Out[103]:
array([[ 1. , nan, nan],
[ 1.1, 1. , nan],
[ 1.2, 1.1, 1. ],
[ 1.3, 1.2, 1.1],
[ 1.4, 1.3, 1.2]])
In [104]: np.split(out,out.shape[0],axis=0)
Out[104]:
[array([[ 1., nan, nan]]),
array([[ 1.1, 1. , nan]]),
array([[ 1.2, 1.1, 1. ]]),
array([[ 1.3, 1.2, 1.1]]),
array([[ 1.4, 1.3, 1.2]])]
Memory-efficiency with strides
For memory efficiency, we can use a strided helper, strided_axis0, similar to @B. M.'s solution but a bit more generic.
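The definition of strided_axis0 is not included in this post. The following is a hypothetical reconstruction, assuming it left-pads a 1D array with L-1 fill values and returns a strided view with one window per original element; the real helper may differ.

```python
import numpy as np

def strided_axis0(a, fillval, L):
    """Hypothetical sketch: sliding windows of length L over the 1D array a,
    left-padded with fillval so the result has one row per element of a."""
    a_ext = np.concatenate((np.full(L - 1, fillval), a))
    step = a_ext.strides[0]  # bytes between consecutive elements
    return np.lib.stride_tricks.as_strided(
        a_ext, shape=(a.shape[0], L), strides=(step, step)
    )

values = np.array([1.0, 1.1, 1.2, 1.3, 1.4])
windows = strided_axis0(values, fillval=np.nan, L=3)
print(windows)
```

The rows overlap in memory, so the result is best treated as read-only; writing to one row would change its neighbours too.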
So, to get a 2D array of values with NaNs preceding the first element -
In [35]: strided_axis0(s.values, fillval=np.nan, L=3)
Out[35]:
array([[nan, nan, 1. ],
[nan, 1. , 1.1],
[1. , 1.1, 1.2],
[1.1, 1.2, 1.3],
[1.2, 1.3, 1.4]])
To get a 2D array with the NaN fillers coming after the original elements in each row and the order of elements flipped, as stated in the problem -
In [36]: strided_axis0(s.values, fillval=np.nan, L=3)[:,::-1]
Out[36]:
array([[1. , nan, nan],
[1.1, 1. , nan],
[1.2, 1.1, 1. ],
[1.3, 1.2, 1.1],
[1.4, 1.3, 1.2]])
To get a series with each element holding each window as a list, simply wrap the earlier methods with pd.Series(out.tolist()), with out being the 2D array output -
In [38]: pd.Series(strided_axis0(s.values, fillval=np.nan, L=3)[:,::-1].tolist())
Out[38]:
0 [1.0, nan, nan]
1 [1.1, 1.0, nan]
2 [1.2, 1.1, 1.0]
3 [1.3, 1.2, 1.1]
4 [1.4, 1.3, 1.2]
dtype: object
If you pad the series with NaNs at the beginning and the end, you can use a simple sliding window:
def wndw(s,size=3):
    # size-1 NaNs in front of the values, size NaNs behind them
    stretched = np.hstack([
        np.array([np.nan]*(size-1)),
        s.values.T,
        np.array([np.nan]*size)
    ])
    # because of the trailing NaNs, len(s)+size-1 windows are yielded in total
    for begin in range(len(stretched)-size):
        end = begin+size
        yield stretched[begin:end][::-1]
for arr in wndw(s, 3):
    print(arr)
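If only one window per original element is wanted, padding at the front alone is enough. Below is a minimal sketch of that variant; leading_windows is a hypothetical name and the sample series is assumed to match the one used elsewhere in this thread.

```python
import numpy as np
import pandas as pd

def leading_windows(s, size=3):
    """Hypothetical variant: yield exactly len(s) reversed windows,
    padding only the front with size-1 NaNs."""
    padded = np.concatenate((np.full(size - 1, np.nan), s.to_numpy(dtype=float)))
    for begin in range(len(s)):
        yield padded[begin:begin + size][::-1]

s = pd.Series([1.0, 1.1, 1.2, 1.3, 1.4], index=[1, 2, 3, 4, 5])
result = pd.Series([w.tolist() for w in leading_windows(s)], index=s.index)
print(result)
```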
Here's one way to do it
In [294]: arr = [s.shift(x).values[::-1][:3] for x in range(len(s))[::-1]]
In [295]: arr
Out[295]:
[array([ 1., nan, nan]),
array([ 1.1, 1. , nan]),
array([ 1.2, 1.1, 1. ]),
array([ 1.3, 1.2, 1.1]),
array([ 1.4, 1.3, 1.2])]
In [296]: pd.Series(arr, index=s.index)
Out[296]:
1 [1.0, nan, nan]
2 [1.1, 1.0, nan]
3 [1.2, 1.1, 1.0]
4 [1.3, 1.2, 1.1]
5 [1.4, 1.3, 1.2]
dtype: object
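A related way to phrase the shift idea, sketched here as an alternative rather than as the method above, is to build one shifted copy per window position (n copies instead of len(s)) and read the windows off row by row. The names lagged and result and the sample series are assumptions for illustration.

```python
import pandas as pd

s = pd.Series([1.0, 1.1, 1.2, 1.3, 1.4], index=[1, 2, 3, 4, 5])
n = 3  # window length

# Column k holds the series lagged by k steps, so row i reads
# [s[i], s[i-1], ..., s[i-n+1]] with NaN where no earlier value exists.
lagged = pd.concat([s.shift(k) for k in range(n)], axis=1)

result = pd.Series(lagged.to_numpy().tolist(), index=s.index)
print(result)
```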