How to return a view of several columns in numpy structured array

Building on @HYRY's answer, you could also use ndarray's method getfield:

import numpy

def fields_view(array, fields):
    # Build a dtype with only the requested fields, kept at their original offsets.
    return array.getfield(numpy.dtype(
        {name: array.dtype.fields[name] for name in fields}
    ))
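
To check that the result really is a view, here is a minimal sketch. The sample array `data` and its field names are made up for illustration and are not from the question; the function is the same as above, just imported as `np`.

```python
import numpy as np

def fields_view(array, fields):
    # dtype.fields maps each name to (dtype, byte offset), which np.dtype
    # accepts back as a field dictionary.
    sub_dtype = np.dtype({name: array.dtype.fields[name] for name in fields})
    return array.getfield(sub_dtype)

# Hypothetical sample data, only for demonstration.
data = np.zeros(3, dtype=[('x', 'i8'), ('y', 'f8'), ('z', 'i8')])

view = fields_view(data, ['x', 'z'])
view['z'][:] = 7                      # write through the view

print(data['z'])                      # the original array sees the change
print(np.shares_memory(view, data))   # no copy was made
```

Note that the view keeps the original strides, so the skipped field `y` is still sitting in memory between `x` and `z`.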

As of NumPy version 1.16, the code you propose will return a view. See 'NumPy 1.16.0 Release Notes->Future Changes->multi-field views return a view instead of a copy' on this page:

https://numpy.org/doc/stable/release/1.16.0-notes.html#multi-field-views-return-a-view-instead-of-a-copy
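
A small sketch of what that looks like in practice, assuming NumPy 1.16 or later. The array `arr` is invented for the example. If you want a compact copy instead of a view, `numpy.lib.recfunctions.repack_fields` is one way to get it.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical sample data, only for demonstration.
arr = np.zeros(3, dtype=[('x', 'i8'), ('y', 'f8'), ('z', 'i8')])

sub = arr[['x', 'z']]        # multi-field index: a view on NumPy >= 1.16
sub['x'][:] = 5              # write through the view

print(arr['x'])                      # the original array is modified
print(np.shares_memory(sub, arr))    # view, not a copy

# The view keeps the original field offsets (the gap left by 'y' is still
# there). repack_fields returns a packed copy without the gap.
packed = rfn.repack_fields(sub)
print(sub.dtype.itemsize, packed.dtype.itemsize)
```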


I don't think there is an easy way to achieve what you want. In general, you cannot take an arbitrary view into an array. Try the following:

>>> a
array([(1.5, 2.5, [[1.0, 2.0], [1.0, 2.0]]),
       (3.0, 4.0, [[4.0, 5.0], [4.0, 5.0]]),
       (1.0, 3.0, [[2.0, 6.0], [2.0, 6.0]])], 
      dtype=[('x', '<f8'), ('y', '<f8'), ('value', '<f8', (2, 2))])
>>> a.view(float)
array([ 1.5,  2.5,  1. ,  2. ,  1. ,  2. ,  3. ,  4. ,  4. ,  5. ,  4. ,
        5. ,  1. ,  3. ,  2. ,  6. ,  2. ,  6. ])

The float view of your record array shows you how the actual data is stored in memory. A view into this data has to be expressible as a combination of a shape, strides and offset into the above data. So if you wanted, for instance, a view of 'x' and 'y' only, you could do the following:

>>> from numpy.lib.stride_tricks import as_strided
>>> b = as_strided(a.view(float), shape=a.shape + (2,),
                   strides=a.strides + a.view(float).strides)
>>> b
array([[ 1.5,  2.5],
       [ 3. ,  4. ],
       [ 1. ,  3. ]])

The as_strided call does the same thing as the following, which is perhaps easier to understand:

>>> bb = a.view(float).reshape(a.shape + (-1,))[:, :2]
>>> bb
array([[ 1.5,  2.5],
       [ 3. ,  4. ],
       [ 1. ,  3. ]])

Either of these is a view into a:

>>> b[0, 0] = 0
>>> a
array([(0.0, 2.5, [[1.0, 2.0], [1.0, 2.0]]),
       (3.0, 4.0, [[4.0, 5.0], [4.0, 5.0]]),
       (1.0, 3.0, [[2.0, 6.0], [2.0, 6.0]])],
      dtype=[('x', '<f8'), ('y', '<f8'), ('value', '<f8', (2, 2))])
>>> bb[2, 1] = 0
>>> a
array([(0.0, 2.5, [[1.0, 2.0], [1.0, 2.0]]),
       (3.0, 4.0, [[4.0, 5.0], [4.0, 5.0]]),
       (1.0, 0.0, [[2.0, 6.0], [2.0, 6.0]])],
      dtype=[('x', '<f8'), ('y', '<f8'), ('value', '<f8', (2, 2))])
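
For anyone who wants to rerun the session, here is a self-contained sketch. The session never shows how `a` was built, so the `np.array(...)` call below is my assumption, reconstructed from the printed repr.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

dt = [('x', '<f8'), ('y', '<f8'), ('value', '<f8', (2, 2))]
a = np.array([(1.5, 2.5, [[1.0, 2.0], [1.0, 2.0]]),
              (3.0, 4.0, [[4.0, 5.0], [4.0, 5.0]]),
              (1.0, 3.0, [[2.0, 6.0], [2.0, 6.0]])], dtype=dt)

flat = a.view(float)   # 18 floats: 6 per record, in memory order

# One row per record (stride = record size), two columns (stride = one float).
b = as_strided(flat, shape=a.shape + (2,), strides=a.strides + flat.strides)

# Same selection with ordinary reshaping and slicing.
bb = flat.reshape(a.shape + (-1,))[:, :2]

print(np.array_equal(b, bb))
print(np.shares_memory(b, a), np.shares_memory(bb, a))

b[0, 0] = 0     # writes to a['x'][0]
bb[2, 1] = 0    # writes to a['y'][2]
print(a['x'], a['y'])
```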

It would be nice if either of these could be converted into a record array, but NumPy refuses to do so, for reasons that are not entirely clear to me:

>>> b.view([('x',float), ('y',float)])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: new type not compatible with array.

Of course what works (sort of) for 'x' and 'y' would not work, for instance, for 'x' and 'value', so in general the answer is: it cannot be done.
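
To illustrate why 'x' and 'y' fit into a plain strided float view while 'x' and 'value' do not, here is a rough sketch that lists the byte offsets of the floats inside one record. The helper `float_offsets` is a name I made up for this illustration, and it assumes every field is made of 8-byte floats.

```python
import numpy as np

dt = np.dtype([('x', '<f8'), ('y', '<f8'), ('value', '<f8', (2, 2))])

def float_offsets(dtype, names):
    """Byte offsets, within one record, of every float in the named fields."""
    step = np.dtype(float).itemsize            # 8 bytes per float
    offsets = []
    for name in names:
        field_dtype, start = dtype.fields[name][:2]
        count = field_dtype.itemsize // step   # floats in this field
        offsets.extend(start + step * i for i in range(count))
    return offsets

print(float_offsets(dt, ['x', 'y']))       # evenly spaced: one stride suffices
print(float_offsets(dt, ['x', 'value']))   # gap where 'y' sits: no single stride
```

A single stride along the second axis can only describe evenly spaced elements, which is why the second selection cannot be written as a shape/strides/offset view of floats.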


You can create a dtype object that contains only the fields you want, and use numpy.ndarray() to create a view of the original array:

import numpy as np
strc = np.zeros(3, dtype=[('x', int), ('y', float), ('z', int), ('t', "i8")])

def fields_view(arr, fields):
    dtype2 = np.dtype({name: arr.dtype.fields[name] for name in fields})
    return np.ndarray(arr.shape, dtype2, arr, 0, arr.strides)

v1 = fields_view(strc, ["x", "z"])
v1[0] = 10, 100

v2 = fields_view(strc, ["y", "z"])
v2[1:] = [(3.14, 7)]

v3 = fields_view(strc, ["x", "t"])

v3[1:] = [(1000, 2**16)]

print(strc)

Here is the output:

[(10, 0.0, 100, 0L) (1000, 3.14, 7, 65536L) (1000, 3.14, 7, 65536L)]