When should I use hstack/vstack vs append vs concatenate vs column_stack?
If you have two matrices, you're good to go with just hstack and vstack:
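As a small sketch of the two-matrix case (the arrays `a` and `b` are made-up values for illustration, not taken from the original figures):

```python
import numpy as np

# Two hypothetical 2x2 matrices
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# hstack puts b to the right of a: result has shape (2, 4)
side_by_side = np.hstack((a, b))

# vstack puts b below a: result has shape (4, 2)
stacked = np.vstack((a, b))

print(side_by_side)
print(stacked)
```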
If you're stacking a matrix and a vector, hstack becomes tricky to use, so column_stack is a better option:
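To make the difference concrete, here is a minimal sketch with an assumed 3x2 matrix `m` and a length-3 vector `v` (both invented for the example). hstack only works once the vector is reshaped into a column, while column_stack does that reshaping itself:

```python
import numpy as np

m = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
v = np.array([7, 8, 9])                  # shape (3,)

# np.hstack((m, v)) fails because the inputs have different numbers
# of dimensions, so the vector has to be turned into a column first
with_hstack = np.hstack((m, v[:, None]))

# column_stack treats a 1-D input as a column automatically
with_column_stack = np.column_stack((m, v))

print(with_column_stack)
```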
If you're stacking two vectors, you've got three options:
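The original illustration is not reproduced here. Assuming the three options meant are hstack, vstack and column_stack, a sketch with two made-up vectors looks like this:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# One longer vector, shape (6,)
as_one_vector = np.hstack((a, b))

# Each vector becomes a row, shape (2, 3)
as_rows = np.vstack((a, b))

# Each vector becomes a column, shape (3, 2)
as_columns = np.column_stack((a, b))
```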
And concatenate in its raw form is useful for 3D and above; see my article Numpy Illustrated for details.
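For instance, with 3D inputs there is no hstack/vstack-style name for every axis, but concatenate takes the axis directly. A rough sketch, using arbitrary shapes chosen for illustration:

```python
import numpy as np

x = np.zeros((2, 3, 4))
y = np.ones((2, 3, 4))

# Join along each of the three axes in turn and record the result shape
shapes = {axis: np.concatenate((x, y), axis=axis).shape for axis in range(3)}

for axis, shape in shapes.items():
    print(axis, shape)
```

Only the dimension along the chosen axis grows; the other two must match between the inputs.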
numpy.vstack: Stack arrays in sequence vertically (row wise). Equivalent to np.concatenate(tup, axis=0) for arrays that are at least 2-D; 1-D arrays of shape (N,) are first reshaped to (1, N).
For examples, see: https://docs.scipy.org/doc/numpy/reference/generated/numpy.vstack.html
numpy.hstack: Stack arrays in sequence horizontally (column wise). Equivalent to np.concatenate(tup, axis=1), except for 1-D arrays, where it concatenates along the first axis.
For examples, see: https://docs.scipy.org/doc/numpy/reference/generated/numpy.hstack.html
append is a method of Python's built-in list data structure. Each call adds one element to the list. To add multiple elements, you use extend. Simply put, numpy's functions are much more powerful.
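A quick sketch of the list behaviour described above (the list contents are arbitrary):

```python
items = [1, 2]
items.append(3)         # adds a single element
items.extend([4, 5])    # adds each element of the iterable

nested = [1, 2]
nested.append([3, 4])   # the whole list is added as one element

print(items)
print(nested)
```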
Example:
Suppose gray.shape = (n0, n1).
np.vstack((gray, gray, gray)) will have shape (n0*3, n1); you can also do it with np.concatenate((gray, gray, gray), axis=0).
np.hstack((gray, gray, gray)) will have shape (n0, n1*3); you can also do it with np.concatenate((gray, gray, gray), axis=1).
np.dstack((gray, gray, gray)) will have shape (n0, n1, 3).
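These shapes can be checked directly. In the sketch below, `n0` and `n1` are arbitrary sizes picked for illustration and `gray` is just an array of zeros standing in for a grayscale image:

```python
import numpy as np

n0, n1 = 4, 5                 # assumed sizes, any positive integers work
gray = np.zeros((n0, n1))

v = np.vstack((gray, gray, gray))   # rows are repeated
h = np.hstack((gray, gray, gray))   # columns are repeated
d = np.dstack((gray, gray, gray))   # a new third axis is created

print(v.shape, h.shape, d.shape)
```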
In IPython you can look at the source code of a function by typing its name followed by ??. Taking a look at hstack, we can see that it's actually just a wrapper around concatenate (similarly with vstack and column_stack):
np.hstack??
def hstack(tup):
    ...
    arrs = [atleast_1d(_m) for _m in tup]
    # As a special case, dimension 0 of 1-dimensional arrays is "horizontal"
    if arrs[0].ndim == 1:
        return _nx.concatenate(arrs, 0)
    else:
        return _nx.concatenate(arrs, 1)
So I guess just use whichever one has the most logical sounding name to you.
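The special case in the hstack source above can be checked with a short sketch (the input arrays are invented for the example): 1-D inputs are joined along axis 0, anything else along axis 1.

```python
import numpy as np

v1, v2 = np.array([1, 2]), np.array([3, 4])          # 1-D inputs
m1, m2 = np.array([[1], [2]]), np.array([[3], [4]])  # 2-D inputs, shape (2, 1)

flat = np.hstack((v1, v2))   # same as concatenating along axis 0
wide = np.hstack((m1, m2))   # same as concatenating along axis 1

print(flat.shape, wide.shape)
```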
All the functions are written in Python except np.concatenate. With an IPython shell you can just use ?? to read their source.
If not, here's a summary of their code:
vstack
    concatenate([atleast_2d(_m) for _m in tup], 0)
    i.e. turn all inputs into 2-D (or more) and concatenate on the first axis
hstack
    concatenate([atleast_1d(_m) for _m in tup], axis=<0 or 1>)
column_stack
    transform inputs with fewer than 2 dimensions with
    array(arr, copy=False, subok=True, ndmin=2).T
    and then concatenate on axis 1
append
    concatenate((asarray(arr), values), axis=axis)
In other words, they all work by tweaking the dimensions of the input arrays, and then concatenating on the right axis. They are just convenience functions.
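As a rough check of that summary, the sketch below rebuilds three of the helpers by hand from concatenate and compares the results. The variable names and sample vectors are made up, and the hand-written versions follow the simplified summary rather than the exact library source:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# vstack: promote to at least 2-D, then join on axis 0
vstack_by_hand = np.concatenate([np.atleast_2d(x) for x in (a, b)], axis=0)

# column_stack: turn each 1-D input into a column, then join on axis 1
column_stack_by_hand = np.concatenate(
    [np.array(x, ndmin=2).T for x in (a, b)], axis=1
)

# append with an explicit axis: plain concatenate of the two arguments
append_by_hand = np.concatenate((np.asarray(a), b), axis=0)

print(np.array_equal(vstack_by_hand, np.vstack((a, b))))
print(np.array_equal(column_stack_by_hand, np.column_stack((a, b))))
print(np.array_equal(append_by_hand, np.append(a, b, axis=0)))
```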
And the newer np.stack:
arrays = [asanyarray(arr) for arr in arrays]
shapes = set(arr.shape for arr in arrays)
result_ndim = arrays[0].ndim + 1
axis = normalize_axis_index(axis, result_ndim)
sl = (slice(None),) * axis + (_nx.newaxis,)
expanded_arrays = [arr[sl] for arr in arrays]
concatenate(expanded_arrays, axis=axis, out=out)
That is, it expands the dims of all inputs (a bit like np.expand_dims), and then concatenates. With axis=0, the effect is the same as np.array.
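A small sketch of both claims, using two invented vectors: np.stack with axis=0 matches np.array on the same inputs, and for another axis it matches expanding each input with np.expand_dims and then concatenating.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

stacked0 = np.stack((a, b), axis=0)   # new leading axis, shape (2, 3)
stacked1 = np.stack((a, b), axis=1)   # new trailing axis, shape (3, 2)

# The same result as axis=1, built by hand
by_hand = np.concatenate([np.expand_dims(x, 1) for x in (a, b)], axis=1)

print(stacked0.shape, stacked1.shape)
```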
The hstack documentation now adds:
The functions concatenate, stack and block provide more general stacking and concatenation operations.
np.block is also new. In effect, it recursively concatenates along the nested lists.
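As an illustration, here is a sketch that assembles a matrix from four made-up blocks. The innermost lists are joined along the last axis and the outer list along the axis before it, so for 2-D blocks the result matches nesting hstack inside vstack:

```python
import numpy as np

# Four hypothetical blocks with compatible shapes
A = np.ones((2, 2))
B = np.zeros((2, 3))
C = np.zeros((1, 2))
D = np.ones((1, 3))

M = np.block([[A, B],
              [C, D]])   # shape (3, 5)

# The same layout written with hstack and vstack
by_hand = np.vstack((np.hstack((A, B)), np.hstack((C, D))))

print(M)
```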