numpy: How to add a column to an existing structured array?

Have you tried using numpy's recfunctions?

import numpy as np
import numpy.lib.recfunctions as rfn

It has some very useful functions for structured arrays.

For your case, I think it could be accomplished with:

a = rfn.append_fields(a, 'USNG', np.empty(a.shape[0], dtype='|S100'), dtypes='|S100')

Tested here and it worked.
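
One thing to watch: append_fields returns a masked array by default. Below is a minimal sketch, assuming the array from the question and that you want a plain ndarray back. The names new_col and a_ext are just for illustration, and np.zeros is swapped in for np.empty so the new strings start out empty rather than uninitialized:

```python
import numpy as np
import numpy.lib.recfunctions as rfn

a = np.array([(1, [-112.01268501699997, 40.64249414272372]),
              (2, [-111.86145708699996, 40.4945008710162])],
             dtype=[('i', '<i8'), ('loc', '<f8', (2,))])

# zeros gives empty byte strings instead of whatever np.empty leaves in memory
new_col = np.zeros(a.shape[0], dtype='S100')

# usemask=False requests a plain ndarray rather than a MaskedArray
a_ext = rfn.append_fields(a, 'USNG', new_col, usemask=False)

# the new field can then be filled like any other field
a_ext['USNG'] = [b'FOO', b'BAR']
print(a_ext.dtype.names)
```

The original a is left untouched; append_fields builds a new array.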


As GMSL mentioned in the comments, this is also possible with rfn.merge_arrays, as shown below:

a = np.array([(1, [-112.01268501699997, 40.64249414272372]),
       (2, [-111.86145708699996, 40.4945008710162])], 
      dtype=[('i', '<i8'), ('loc', '<f8', (2,))])
a2 = np.full(a.shape[0], '', dtype=[('USNG', '|S100')])
a3 = rfn.merge_arrays((a, a2), flatten=True)

a3 will have the value:

array([(1, [-112.01268502,   40.64249414], b''),
       (2, [-111.86145709,   40.49450087], b'')],
      dtype=[('i', '<i8'), ('loc', '<f8', (2,)), ('USNG', 'S100')])
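
If you already have the values for the new column, the same merge_arrays call can probably take them directly instead of an empty placeholder. A rough sketch, assuming the values fit in 100 bytes (usng is a hypothetical name, and the view call is just one way to give the plain array a field name):

```python
import numpy as np
import numpy.lib.recfunctions as rfn

a = np.array([(1, [-112.01268501699997, 40.64249414272372]),
              (2, [-111.86145708699996, 40.4945008710162])],
             dtype=[('i', '<i8'), ('loc', '<f8', (2,))])

# plain array of byte strings holding the new column's values
usng = np.array([b'FOO', b'BAR'], dtype='S100')

# reinterpret it as a one-field structured array so the field gets a name
a2 = usng.view([('USNG', 'S100')])

# flatten=True puts all fields at the top level of the result
a3 = rfn.merge_arrays((a, a2), flatten=True)
print(a3.dtype.names)
```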

You have to create a new dtype that contains the new field.

For example, here's a:

In [86]: a
array([(1, [-112.01268501699997, 40.64249414272372]),
       (2, [-111.86145708699996, 40.4945008710162])], 
      dtype=[('i', '<i8'), ('loc', '<f8', (2,))])

a.dtype.descr is [('i', '<i8'), ('loc', '<f8', (2,))]; i.e. a list of field descriptions, each a tuple of name, type and (optionally) shape. We'll create a new dtype by adding ('USNG', 'S100') to the end of that list:

In [87]: new_dt = np.dtype(a.dtype.descr + [('USNG', 'S100')])

Now create a new structured array, b. I used zeros here, so the string fields will start out with the value ''. You could also use empty. The strings will then contain garbage, but that won't matter if you immediately assign values to them.

In [88]: b = np.zeros(a.shape, dtype=new_dt)

Copy over the existing data from a to b:

In [89]: b['i'] = a['i']

In [90]: b['loc'] = a['loc']

Here's b now:

In [91]: b
array([(1, [-112.01268501699997, 40.64249414272372], ''),
       (2, [-111.86145708699996, 40.4945008710162], '')], 
      dtype=[('i', '<i8'), ('loc', '<f8', (2,)), ('USNG', 'S100')])

Fill in the new field with some data:

In [93]: b['USNG'] = ['FOO', 'BAR']

In [94]: b
array([(1, [-112.01268501699997, 40.64249414272372], 'FOO'),
       (2, [-111.86145708699996, 40.4945008710162], 'BAR')], 
      dtype=[('i', '<i8'), ('loc', '<f8', (2,)), ('USNG', 'S100')])
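
The steps above can be wrapped in a small helper that copies every existing field in a loop. This is only a sketch: add_field is a hypothetical name, not a numpy function, and it assumes a packed dtype (for aligned or padded dtypes, dtype.descr can contain unnamed padding entries that would need extra handling):

```python
import numpy as np

def add_field(arr, name, dtype, values=None):
    """Return a new structured array with one extra field appended."""
    # extend the list of field descriptions with the new field
    new_dt = np.dtype(arr.dtype.descr + [(name, dtype)])
    # zeros, so the new field starts out empty rather than uninitialized
    out = np.zeros(arr.shape, dtype=new_dt)
    # copy the existing fields one by one
    for field in arr.dtype.names:
        out[field] = arr[field]
    # optionally fill in the new field
    if values is not None:
        out[name] = values
    return out

a = np.array([(1, [-112.01268501699997, 40.64249414272372]),
              (2, [-111.86145708699996, 40.4945008710162])],
             dtype=[('i', '<i8'), ('loc', '<f8', (2,))])

b = add_field(a, 'USNG', 'S100', ['FOO', 'BAR'])
print(b.dtype.names)
```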