Why doesn't chained (interval) comparison work on numpy arrays?

0 < numlist < 3.5

is equivalent to:

(0 < numlist) and (numlist < 3.5)

except that numlist is only evaluated once.

The implicit and between the two results is what causes the error.

So the docs say:

Formally, if a, b, c, ..., y, z are expressions and op1, op2, ..., opN are comparison operators, then a op1 b op2 c ... y opN z is equivalent to a op1 b and b op2 c and ... y opN z, except that each expression is evaluated at most once.

and, for the example chain x < y <= z:

(but in both cases z is not evaluated at all when x < y is found to be false).
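Both parts of that rule can be checked in plain Python. The sketch below is only an illustration: tracked and calls are hypothetical helper names that are not part of the docs, and they exist just to log which operands actually get evaluated.

```python
calls = []

def tracked(name, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(name)
    return value

# The middle operand takes part in two comparisons but is evaluated once.
calls.clear()
result_true = tracked("a", 0) < tracked("b", 5) < tracked("c", 10)
calls_when_true = list(calls)

# The first comparison fails, so the last operand is never evaluated.
calls.clear()
result_false = tracked("a", 7) < tracked("b", 5) < tracked("c", 10)
calls_when_false = list(calls)

print(result_true, calls_when_true)    # True ['a', 'b', 'c']
print(result_false, calls_when_false)  # False ['a', 'b']
```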

For a scalar

In [20]: x=5
In [21]: 0<x<10
Out[21]: True
In [22]: 0<x and x<10
Out[22]: True

But with an array

In [24]: x=np.array([4,5,6])    
In [25]: 0<x and x<10
...
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

This ValueError arises when a numpy boolean array is used in a context that expects a scalar boolean.

In [26]: (0<x)
Out[26]: array([ True,  True,  True], dtype=bool)

In [30]: np.array([True, False]) or True
...
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
In [33]: if np.array([True, False]): print('yes')
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
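What if, and and or do implicitly is ask the object for its truth value, which is the same thing bool() does. A small sketch of that, assuming a two-element mask like the one above (the variable names are mine), together with the two reductions the error message suggests:

```python
import numpy as np

mask = np.array([True, False])

# `if`, `and` and `or` need a single truth value; bool() asks for the same thing.
try:
    bool(mask)
    raised = False
except ValueError:
    raised = True

# The reductions named in the error message collapse the array to one value.
any_true = bool(mask.any())   # at least one element is True
all_true = bool(mask.all())   # every element is True

print(raised, any_true, all_true)
```

Which reduction is right depends on the question being asked, which is why numpy refuses to guess.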

It evaluates 0<x, but never gets to x<10, because the resulting boolean array can't be used in an and/or context. numpy defines the elementwise operators | and &, but Python does not let a class redefine or and and.

In [34]: (0<x) & (x<10)
Out[34]: array([ True,  True,  True], dtype=bool)
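The operators also have function equivalents, np.logical_and and np.logical_or, which sidestep the parenthesization question because each comparison is passed as a separate argument. This is not in the session above, just a sketch with the same x:

```python
import numpy as np

x = np.array([4, 5, 6])

# Elementwise "and": inside the open interval (0, 10).
inside_op = (0 < x) & (x < 10)
inside_fn = np.logical_and(0 < x, x < 10)

# Elementwise "or": outside that interval.
outside_op = (x <= 0) | (x >= 10)
outside_fn = np.logical_or(x <= 0, x >= 10)

print(inside_fn)
print(outside_fn)
```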

When we use 0 < x <10 we are implicitly expecting to evaluate a vectorized version of the scalar chained expression.

In [35]: f = np.vectorize(lambda x: 0<x<10, otypes=[bool])
In [36]: f(x)
Out[36]: array([ True,  True,  True], dtype=bool)
In [37]: f([-1,5,11])
Out[37]: array([False,  True, False], dtype=bool)
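For comparison, here is a sketch that puts the np.vectorize version next to the whole-array expression on the same input. The variable names are only for illustration. np.vectorize still calls the Python lambda once per element, so it is mostly a convenience here.

```python
import numpy as np

values = np.array([-1, 5, 11])

# Per-element Python call: the scalar chained comparison applied to each item.
chained = np.vectorize(lambda v: 0 < v < 10, otypes=[bool])
per_element = chained(values)

# Whole-array comparisons joined with the elementwise & operator.
whole_array = (0 < values) & (values < 10)

print(per_element)
print(whole_array)
```

Both give False, True, False for this input.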

Note that attempting to apply chaining to a list doesn't even get past the first <:

In [39]: 0 < [-1,5,11]
TypeError: unorderable types: int() < list()

The following expressions show that the & operator binds more tightly than the < operator:

In [44]: 0 < x & x<10
ValueError ...

In [45]: (0 < x) & x<10
Out[45]: array([ True,  True,  True], dtype=bool)

In [46]: 0 < x & (x<10)
Out[46]: array([False,  True, False], dtype=bool)

In [47]: 0 < (x & x)<10
ValueError...

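So the safe version is (0 < x) & (x<10), which makes sure that every < is evaluated before the &. In [45] gives the expected answer only by coincidence: it is evaluated as ((0 < x) & x) < 10, and every element of (0 < x) & x is 0 or 1, which is always less than 10.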
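The grouping can also be inspected without running anything, by asking Python's ast module how it parses each expression. This is just a sketch for checking precedence. x is never evaluated, so it does not need to exist.

```python
import ast

# Parse only; nothing is evaluated, so x does not have to be defined.
unparenthesized = ast.parse("0 < x & x < 10", mode="eval").body
safe = ast.parse("(0 < x) & (x < 10)", mode="eval").body

# Without parentheses: one chained comparison whose middle operand is (x & x).
print(type(unparenthesized).__name__, len(unparenthesized.ops))  # Compare 2
print(type(unparenthesized.comparators[0]).__name__)             # BinOp

# With parentheses: a single & whose two operands are comparisons.
print(type(safe).__name__,
      type(safe.left).__name__,
      type(safe.right).__name__)                                 # BinOp Compare Compare
```

The first parse is a chained comparison, which is why In [44] hits the implicit and and raises the ValueError.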

edit

Here's a further example that confirms the short-circuit evaluation of the implicit and:

In [53]: x=2
In [54]: 3<x<np.arange(4)
Out[54]: False
In [55]: 1<x<np.arange(4)
Out[55]: array([False, False, False,  True])

When 3<x is False, it returns that, without further evaluation.

When it is True, it goes on to evaluate x<np.arange(4), returning a 4-element boolean array.

Or with a list that doesn't support < at all:

In [56]: 3<x<[1,2,3]
Out[56]: False
In [57]: 1<x<[1,2,3]
Traceback (most recent call last):
  File "<ipython-input-57-e7430e03ad55>", line 1, in <module>
    1<x<[1,2,3]
TypeError: '<' not supported between instances of 'int' and 'list'
