How to add column to numpy array

The easiest solution is to use numpy.insert().

The Advantage of np.insert() over np.append is that you can insert the new columns into custom indices.

import numpy as np

X = np.arange(20).reshape(10,2)

X = np.insert(X, [0,2], np.random.rand(X.shape[0]*2).reshape(-1,2)*10, axis=1)
'''


I think that your problem is that you are expecting np.append to add the column in-place, but what it does, because of how numpy data is stored, is create a copy of the joined arrays

Returns
-------
append : ndarray
    A copy of `arr` with `values` appended to `axis`.  Note that `append`
    does not occur in-place: a new array is allocated and filled.  If
    `axis` is None, `out` is a flattened array.

so you need to save the output all_data = np.append(...):

my_data = np.random.random((210,8)) #recfromcsv('LIAB.ST.csv', delimiter='\t')
new_col = my_data.sum(1)[...,None] # None keeps (n, 1) shape
new_col.shape
#(210,1)
all_data = np.append(my_data, new_col, 1)
all_data.shape
#(210,9)

Alternative ways:

all_data = np.hstack((my_data, new_col))
#or
all_data = np.concatenate((my_data, new_col), 1)

I believe that the only difference between these three functions (as well as np.vstack) are their default behaviors for when axis is unspecified:

  • concatenate assumes axis = 0
  • hstack assumes axis = 1 unless inputs are 1d, then axis = 0
  • vstack assumes axis = 0 after adding an axis if inputs are 1d
  • append flattens array

Based on your comment, and looking more closely at your example code, I now believe that what you are probably looking to do is add a field to a record array. You imported both genfromtxt which returns a structured array and recfromcsv which returns the subtly different record array (recarray). You used the recfromcsv so right now my_data is actually a recarray, which means that most likely my_data.shape = (210,) since recarrays are 1d arrays of records, where each record is a tuple with the given dtype.

So you could try this:

import numpy as np
from numpy.lib.recfunctions import append_fields
x = np.random.random(10)
y = np.random.random(10)
z = np.random.random(10)
data = np.array( list(zip(x,y,z)), dtype=[('x',float),('y',float),('z',float)])
data = np.recarray(data.shape, data.dtype, buf=data)
data.shape
#(10,)
tot = data['x'] + data['y'] + data['z'] # sum(axis=1) won't work on recarray
tot.shape
#(10,)
all_data = append_fields(data, 'total', tot, usemask=False)
all_data
#array([(0.4374783740738456 , 0.04307289878861764, 0.021176067323686598, 0.5017273401861498),
#       (0.07622262416466963, 0.3962146058689695 , 0.27912715826653534 , 0.7515643883001745),
#       (0.30878532523061153, 0.8553768789387086 , 0.9577415585116588  , 2.121903762680979 ),
#       (0.5288343561208022 , 0.17048864443625933, 0.07915689716226904 , 0.7784798977193306),
#       (0.8804269791375121 , 0.45517504750917714, 0.1601389248542675  , 1.4957409515009568),
#       (0.9556552723429782 , 0.8884504475901043 , 0.6412854758843308  , 2.4853911958174133),
#       (0.0227638618687922 , 0.9295332854783015 , 0.3234597575660103  , 1.275756904913104 ),
#       (0.684075052174589  , 0.6654774682866273 , 0.5246593820025259  , 1.8742119024637423),
#       (0.9841793718333871 , 0.5813955915551511 , 0.39577520705133684 , 1.961350170439875 ),
#       (0.9889343795296571 , 0.22830104497714432, 0.20011292764078448 , 1.4173483521475858)], 
#      dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8'), ('total', '<f8')])
all_data.shape
#(10,)
all_data.dtype.names
#('x', 'y', 'z', 'total')

If you have an array, a of say 210 rows by 8 columns:

a = numpy.empty([210,8])

and want to add a ninth column of zeros you can do this:

b = numpy.append(a,numpy.zeros([len(a),1]),1)

Tags:

Python

Numpy