How to check if all values in the columns of a numpy matrix are the same?
Given ubuntu's awesome explanation, you can use reduce to solve your problem, but you have to apply it to bitwise_and and bitwise_or rather than equal. As a consequence, this will not work with floating point arrays:
In [60]: np.bitwise_and.reduce(a) == a[0]
Out[60]: array([ True, False, True], dtype=bool)
In [61]: np.bitwise_and.reduce(b) == b[0]
Out[61]: array([ True, False, True], dtype=bool)
Basically, you are comparing the bits of each element in the column. Identical bits are unchanged. Different bits are set to zero. This way, any number that has a zero bit where the others have a one will change the reduced value. However, bitwise_and will not trap the case where bits are introduced rather than removed:
In [62]: c = np.array([[1,0,0],[1,0,0],[1,0,0],[1,1,0]])
In [63]: c
Out[63]:
array([[1, 0, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 1, 0]])
In [64]: np.bitwise_and.reduce(c) == c[0]
Out[64]: array([ True, True, True], dtype=bool)
The second column is clearly wrong. We need to use bitwise_or to trap new bits:
In [66]: np.bitwise_or.reduce(c) == c[0]
Out[66]: array([ True, False, True], dtype=bool)
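To see why both reductions are needed, here is a minimal sketch on two made-up single columns of multi-bit integers (the arrays lost_bit and gained_bit are hypothetical, not from the question). It assumes NumPy 1.12 or later, per the edits further down in this answer:

```python
import numpy as np

# Hypothetical single columns, chosen only to illustrate the two failure modes.
lost_bit = np.array([6, 6, 4])    # 0b110, 0b110, 0b100: a bit disappears
gained_bit = np.array([4, 4, 6])  # 0b100, 0b100, 0b110: a bit appears

and_lost = np.bitwise_and.reduce(lost_bit)      # 6 & 6 & 4
or_lost = np.bitwise_or.reduce(lost_bit)        # 6 | 6 | 4
and_gained = np.bitwise_and.reduce(gained_bit)  # 4 & 4 & 6
or_gained = np.bitwise_or.reduce(gained_bit)    # 4 | 4 | 6

# A removed bit changes the AND result but not the OR result.
print(and_lost == lost_bit[0], or_lost == lost_bit[0])
# An added bit changes the OR result but not the AND result.
print(and_gained == gained_bit[0], or_gained == gained_bit[0])
```

Each reduction catches only one kind of change, so a column counts as constant only when both results match the first element.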
Final Answer
In [69]: np.logical_and(np.bitwise_or.reduce(a) == a[0], np.bitwise_and.reduce(a) == a[0])
Out[69]: array([ True, False, True], dtype=bool)
In [70]: np.logical_and(np.bitwise_or.reduce(b) == b[0], np.bitwise_and.reduce(b) == b[0])
Out[70]: array([ True, False, True], dtype=bool)
In [71]: np.logical_and(np.bitwise_or.reduce(c) == c[0], np.bitwise_and.reduce(c) == c[0])
Out[71]: array([ True, False, True], dtype=bool)
This method is more restrictive and less elegant than ubuntu's suggestion of using all, but it has the advantage of not creating enormous temporary arrays if your input is enormous. The temporary arrays should only be as big as the first row of your matrix.
EDIT
Based on this Q/A and the bug I filed with numpy, the solution provided only works because your array contains zeros and ones. As it happens, the bitwise_and.reduce() operations shown can only ever return zero or one because bitwise_and.identity is 1, not -1. I am keeping this answer in the hope that numpy gets fixed and the answer becomes valid.
Edit
Looks like there will in fact be a change to numpy soon. Certainly to bitwise_and.identity, and also possibly an optional parameter to reduce.
Edit
Good news everyone. The identity for np.bitwise_and has been set to -1 as of version 1.12.0.
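With the identity fixed, the two reductions can be wrapped in a small helper. This is only a sketch, assuming NumPy 1.12 or later and an integer (or boolean) array with at least one row; the name columns_constant_bitwise and the sample array are made up for illustration:

```python
import numpy as np

def columns_constant_bitwise(a):
    """Return a boolean array that is True where a column holds a single value."""
    a = np.asarray(a)
    # Bitwise ufuncs are not defined for floating point dtypes.
    if not (np.issubdtype(a.dtype, np.integer) or a.dtype == np.bool_):
        raise TypeError("bitwise reductions need an integer or boolean array")
    first = a[0]  # assumes at least one row
    all_and = np.bitwise_and.reduce(a, axis=0)  # drops any bit missing somewhere
    all_or = np.bitwise_or.reduce(a, axis=0)    # collects any bit present somewhere
    return np.logical_and(all_and == first, all_or == first)

# Hypothetical input with values other than 0 and 1, including negatives.
sample = np.array([[5, 3, -1],
                   [5, 3, -1],
                   [5, 7, -1]])
result = columns_constant_bitwise(sample)
print(result)
```

The temporaries here are the two reduced rows and the two comparison results, each the size of one row.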
Not as elegant but may also work in the example above.
a = np.array([[1,1,0],[1,-1,0],[1,0,0],[1,1,0]])
Take the difference between each row and the one above it:
np.diff(a,axis=0)==0
array([[ True, False, True],
       [ True, False, True],
       [ True, False, True]])
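The comparison above gives one boolean per pair of adjacent rows, not one per column. A minimal sketch of the remaining step, collapsing along the rows with np.all (this last step is my addition, not part of the snippet above):

```python
import numpy as np

a = np.array([[1, 1, 0],
              [1, -1, 0],
              [1, 0, 0],
              [1, 1, 0]])

# True where a row equals the row above it, element by element.
row_steps_zero = np.diff(a, axis=0) == 0

# A column is constant only if every step down that column is zero.
constant_columns = np.all(row_steps_zero, axis=0)
print(constant_columns)
```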
In [45]: a
Out[45]:
array([[1, 1, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 1, 0]])
Compare each value to the corresponding value in the first row:
In [46]: a == a[0,:]
Out[46]:
array([[ True, True, True],
       [ True, False, True],
       [ True, False, True],
       [ True, True, True]], dtype=bool)
A column shares a common value if all the values in that column are True:
In [47]: np.all(a == a[0,:], axis = 0)
Out[47]: array([ True, False, True], dtype=bool)
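The same two steps can be put into a small function. A minimal sketch, where the name columns_constant and the float array are made up for illustration; since it relies only on ==, it also accepts floating point input:

```python
import numpy as np

def columns_constant(a):
    """Return a boolean array that is True where a column holds a single value."""
    a = np.asarray(a)
    # Broadcast the first row against every row, then require all matches per column.
    return np.all(a == a[0, :], axis=0)

ints = np.array([[1, 1, 0],
                 [1, 0, 0],
                 [1, 0, 0],
                 [1, 1, 0]])
# Hypothetical floating point input.
floats = np.array([[0.5, 2.0],
                   [0.5, 2.5],
                   [0.5, 2.0]])

print(columns_constant(ints))
print(columns_constant(floats))
```

Note that a column containing NaN is reported as not constant, because NaN never compares equal to itself.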
The problem with np.equal.reduce can be seen by micro-analyzing what happens when it is applied to [1, 0, 0, 1]:
In [49]: np.equal.reduce([1, 0, 0, 1])
Out[49]: True
The first two items, 1 and 0, are tested for equality and the result is False:
In [51]: np.equal.reduce([False, 0, 1])
Out[51]: True
Now False and 0 are tested for equality and the result is True:
In [52]: np.equal.reduce([True, 1])
Out[52]: True
But True and 1 are equal, so the total result is True, which is not the desired outcome.
The problem is that reduce tries to accumulate the result "locally", while we want a "global" test like np.all.
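The step-by-step trace can be reproduced in plain Python. This sketch emulates the left-to-right fold with functools.reduce and itertools.accumulate instead of calling np.equal.reduce itself, so it shows the mechanism rather than NumPy's exact implementation:

```python
from functools import reduce
from itertools import accumulate
import operator

values = [1, 0, 0, 1]

# Running result of the pairwise fold: each step compares the previous
# result (not the original value) with the next item.
steps = list(accumulate(values, operator.eq))
folded = reduce(operator.eq, values)

# The "global" test compares every item with the first one.
global_test = all(v == values[0] for v in values)

print(steps)        # intermediate results of the fold
print(folded)       # final result of the fold
print(global_test)  # what we actually wanted to know
```

The fold ends on True because an intermediate False happens to equal 0, while the global test correctly reports that the values differ.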