Numpy matrix of coordinates

The original question was posted six years ago, but I have found myself repeatedly searching for good approaches to this problem and landed here many times. Looking through the Numpy documentation I recently put together an intuitive and dynamic approach for generating n-dimensional coordinates and wanted to post it here for those still hitting this answer.

We use numpy.ndindex() which:

Given the shape of an array, an ndindex instance iterates over the N-dimensional index of the array. At each iteration a tuple of indices is returned, the last dimension is iterated over first.

The best way to understand is to look at an example:

In [100]: for index in np.ndindex(2,2,2):
              print(index)
(0, 0, 0)
(0, 0, 1)
(0, 1, 0)
(0, 1, 1)
(1, 0, 0)
(1, 0, 1)
(1, 1, 0)
(1, 1, 1)

That is exactly what we are looking for, so how do we get it into a numpy array format or as a list?

If we want the coordinates as a list we can use:

In [103]: coordList = [x for x in np.ndindex(2,2,2)]
In [104]: print(coordList)
[(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]

And if we want the coordinates as a numpy array we can use:

In [105]: coordArray = np.stack([x for x in np.ndindex(2,2,2)])
In [106]: print(coordArray)
[[0 0 0]
 [0 0 1]
 [0 1 0]
 [0 1 1]
 [1 0 0]
 [1 0 1]
 [1 1 0]
 [1 1 1]]

This approach easily scales with varying dimensions and sizes, and with numpy.reshape() we can get exactly the format OP is looking for:

In [117]: answer = np.stack([x for x in np.ndindex(2,2)]).reshape(2,2,2)
In [118]: print(answer)
[[[0 0]
  [0 1]]

 [[1 0]
  [1 1]]]

Which, again, easily extends to larger dimensions:

In [120]: example = np.stack([x for x in np.ndindex(3,3,3)]).reshape(3,3,3,3)
In [121]: print(example)
[[[[0 0 0]
   [0 0 1]
   [0 0 2]]

  [[0 1 0]
   [0 1 1]
   [0 1 2]]

  [[0 2 0]
   [0 2 1]
   [0 2 2]]]


 [[[1 0 0]
   [1 0 1]
   [1 0 2]]

  [[1 1 0]
   [1 1 1]
   [1 1 2]]

  [[1 2 0]
   [1 2 1]
   [1 2 2]]]


 [[[2 0 0]
   [2 0 1]
   [2 0 2]]

  [[2 1 0]
   [2 1 1]
   [2 1 2]]

  [[2 2 0]
   [2 2 1]
   [2 2 2]]]]

Try np.meshgrid([0, 1], [0, 1], [0, 1], indexing="ij"). The meshgrid docs are actually pretty explicit about how the default indexing="xy" produces a funny axis ordering as compared to the non-default indexing="ij", so you can check that for more details. (They're not as clear on why it works this way, alas...)


The numpy function indices can also be used to this effect, its functionality being clear from its name as well.

>>> import numpy as np
>>> np.indices((2,3))
array([[[0, 0, 0],
        [1, 1, 1]],

       [[0, 1, 2],
        [0, 1, 2]]])

which can be thought of as a 2 by 3 matrix of y coordinates and a 2 by 3 matrix of x coordinates (y,x = np.indices((2,3))). It can be recast into the form proposed by Jaime by transposing the axes:

>>> np.indices((2,3)).transpose((1,2,0))

It is functionally equivalent to the meshgrid solution, using indexing='ij', but does not require you to provide the coordinate arrays, which can be a benefit when you have many dimensions.

>>> def f1(shape):
...     return np.array(np.meshgrid(*(np.arange(s) for s in shape), indexing='ij'))
...
>>> shape = (200, 31, 15, 4)
>>> np.all(f1(shape) == np.indices(shape))
True

Timing wise, these solutions are similar, when you take into account the time required for generating the 1-D arrays on which meshgrid operates, but meshgrid returns a list (of arrays), not an nd-array like indices. By adding an extra call to np.array as done in f1 above, indices has a clear advantage over meshgrid:

In [14]: %timeit f1(shape)
100 loops, best of 3: 14 ms per loop

In [15]: %timeit np.indices(shape)
100 loops, best of 3: 5.77 ms per loop

Without the extra call to array:

In [16]: def f2(shape):
    return np.meshgrid(*(np.arange(s) for s in shape), indexing='ij')
   .....: 

In [17]: %timeit f2(shape)
100 loops, best of 3: 5.78 ms per loop

Always be careful with interpreting timings though. This likely won't be the bottleneck in any problem you tackle.

In any case, meshgrid can do many more things than indices can, such as generating a more general rectilinear grid instead of a Cartesian grid, so use them when appropriate. In this case, I'd go with the naming-wise more descriptive indices.


Given the 1D coords:

rows = np.arange(2)
cols = np.arange(3)

I was hoping that this would do the trick:

np.dstack((rows[:, None, None], cols[:, None]))

But apparently dstack and the like require exactly matching dimensions, they will not broadcast them, which I think is a shame.

So this alternative is a little long, but explicit is better than implicit, and you can always wrap it all into a little function:

>>> coords = np.empty((len(rows), len(cols), 2), dtype=np.intp)
>>> coords[..., 0] = rows[:, None]
>>> coords[..., 1] = cols

>>> coords
array([[[0, 0],
        [0, 1],
        [0, 2]],

       [[1, 0],
        [1, 1],
        [1, 2]]])