numpy vstack vs. column_stack

The Notes section of the column_stack documentation points out:

This function is equivalent to np.vstack(tup).T.

There are many functions in numpy that are convenient wrappers of other functions. For example, the Notes section of vstack says:

Equivalent to np.concatenate(tup, axis=0) if tup contains arrays that are at least 2-dimensional.

It looks like column_stack is just a convenience function for vstack.
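A quick way to check both documented equivalences is to compare the results directly. This is only a sketch with made-up sample arrays (x, y, m, n are illustrative names, not from the docs). It also suggests that the column_stack/vstack equivalence should be read as a statement about 1D inputs, since the shapes differ once the inputs are 2D:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# For 1D inputs, column_stack gives the transpose of vstack.
cols = np.column_stack((x, y))       # shape (3, 2)
rows_t = np.vstack((x, y)).T         # shape (3, 2)
same_for_1d = np.array_equal(cols, rows_t)

# For inputs that are already 2D, vstack matches concatenate along axis 0.
m = np.arange(6).reshape(2, 3)
n = np.arange(6, 12).reshape(2, 3)
same_for_2d = np.array_equal(np.vstack((m, n)), np.concatenate((m, n), axis=0))

# With 2D inputs the two functions no longer agree:
# column_stack joins them side by side, vstack(...).T does not.
shape_column_stack = np.column_stack((m, n)).shape   # (2, 6)
shape_vstack_t = np.vstack((m, n)).T.shape           # (3, 4)

print(same_for_1d, same_for_2d, shape_column_stack, shape_vstack_t)
```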

For 2D arrays, hstack stacks horizontally (side by side) and vstack stacks vertically (one on top of the other).
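As a small illustration of the two directions, here is a sketch with two 2x2 matrices (left and right are just example names):

```python
import numpy as np

left = np.array([[1, 2],
                 [3, 4]])
right = np.array([[5, 6],
                  [7, 8]])

# hstack places the matrices side by side: more columns, same rows.
side_by_side = np.hstack((left, right))   # shape (2, 4)

# vstack places one on top of the other: more rows, same columns.
on_top = np.vstack((left, right))         # shape (4, 2)

print(side_by_side)
print(on_top)
```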

The problem with hstack is that when you append a column, you need to convert it from a 1D array to a 2D column first, because a 1D array is normally interpreted as a row vector in a 2D context in numpy:

import numpy as np

a = np.ones((2, 2))     # 2d, shape = (2, 2)
b = np.array([0, 0])    # 1d, shape = (2,)

np.hstack((a, b))       # raises ValueError (dimensions mismatch)

So use either hstack((a, b[:, None])) or column_stack((a, b)), where None serves as a shortcut for np.newaxis.
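Putting the pieces together, a minimal runnable sketch (assuming a is meant to be a 2x2 array of ones) shows the failure and both fixes producing the same result:

```python
import numpy as np

a = np.ones((2, 2))      # 2D, shape (2, 2)
b = np.array([0, 0])     # 1D, shape (2,)

# Plain hstack refuses to mix a 2D array with a 1D array.
try:
    np.hstack((a, b))
    hstack_failed = False
except ValueError:
    hstack_failed = True

# Fix 1: turn b into a column of shape (2, 1) by hand.
via_hstack = np.hstack((a, b[:, None]))

# Fix 2: let column_stack do that conversion.
via_column_stack = np.column_stack((a, b))

print(hstack_failed, np.array_equal(via_hstack, via_column_stack))
```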

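If you're stacking two vectors, you've got three options: vstack (each vector becomes a row), column_stack (each vector becomes a column), or hstack (the vectors are joined end to end into one longer vector).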

As for the (undocumented) row_stack, it is just a synonym of vstack, since a 1D array is ready to serve as a matrix row without extra work.

The case of 3D and above proved too large to fit in this answer, so I've covered it in an article called Numpy Illustrated.

I think the following code illustrates the difference nicely:

>>> np.vstack(([1,2,3],[4,5,6]))
array([[1, 2, 3],
       [4, 5, 6]])
>>> np.column_stack(([1,2,3],[4,5,6]))
array([[1, 4],
       [2, 5],
       [3, 6]])
>>> np.hstack(([1,2,3],[4,5,6]))
array([1, 2, 3, 4, 5, 6])

I've included hstack for comparison as well. Notice how column_stack stacks along the second dimension whereas vstack stacks along the first dimension. The equivalent to column_stack is the following hstack command:

>>> np.hstack(([[1],[2],[3]],[[4],[5],[6]]))
array([[1, 4],
       [2, 5],
       [3, 6]])

I hope we can agree that column_stack is more convenient.
