How does __contains__ work for ndarrays?

It seems that NumPy's __contains__ does something like this in the 2-D case:

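def __contains__(self, item):
    # Rough Python model of the observed 2-D behavior, not NumPy's actual code.
    for row in self:
        # Succeed as soon as any element of item equals the element
        # at the same position in some row.
        if any(item_value == row_value for item_value, row_value in zip(item, row)):
            return True
    return False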

[1, 7] is found because its 0th element matches the 0th element of the first row. The same goes for [1, 2] and so on. With [2, 6], the 6 matches the 6 in the last row at the same position. With [2, 3], no element matches any row at the same index. [1, 2, 3] is trivial, since the shapes don't match.
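As a quick check of that reading, here is a minimal sketch. The array a is an assumption on my part (a 3x2 array that is consistent with the cases described above), and contains_model is a hypothetical standalone version of the model, not anything NumPy provides.

```python
import numpy as np

# Assumed example array, chosen to be consistent with the cases discussed above.
a = np.array([[1, 2], [3, 4], [5, 6]])


def contains_model(arr, item):
    """Hypothetical pure-Python model of the observed 2-D behavior."""
    for row in arr:
        # Compare elements position by position within each row.
        if any(i == r for i, r in zip(item, row)):
            return True
    return False


candidates = [[1, 7], [1, 2], [2, 6], [2, 3]]

# What NumPy's `in` actually reports for each candidate.
results = {tuple(c): (c in a) for c in candidates}

# What the pure-Python model predicts for the same candidates.
model = {tuple(c): contains_model(a, c) for c in candidates}

print(results)
print(model == results)
```

The shape-mismatch case ([1, 2, 3]) is left out on purpose: how NumPy handles a comparison between non-broadcastable shapes has changed between versions, so it is not a stable thing to assert on.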

See this for more, and also this ticket.


I found the source for ndarray.__contains__, in numpy/core/src/multiarray/sequence.c. As a comment in the source states,

thing in x

is equivalent to

(x == thing).any()

for an ndarray x, regardless of the dimensions of x and thing. This only makes sense when thing is a scalar. When thing isn't a scalar, broadcasting produces the weird results I observed, as well as oddities like array([1, 2, 3]) in array(1), which I didn't think to try. The exact source is

static int
array_contains(PyArrayObject *self, PyObject *el)
{
    /* equivalent to (self == el).any() */

    int ret;
    PyObject *res, *any;

    res = PyArray_EnsureAnyArray(PyObject_RichCompare((PyObject *)self,
                                                      el, Py_EQ));
    if (res == NULL) {
        return -1;
    }
    any = PyArray_Any((PyArrayObject *)res, NPY_MAXDIMS, NULL);
    Py_DECREF(res);
    ret = PyObject_IsTrue(any);
    Py_DECREF(any);
    return ret;
}
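The equivalence from the source comment can be checked from Python. This is only a sketch: the array x is assumed example data, and contains_row is a hypothetical helper (not part of NumPy) showing one way to test for a whole matching row, assuming row has the same length as the rows of x.

```python
import numpy as np

# Assumed example data, not taken from the question.
x = np.array([[1, 2], [3, 4], [5, 6]])

# `thing in x` should agree with `(x == thing).any()` for scalars and
# for broadcastable sequences alike.
things = [3, 7, [1, 7], [2, 3]]
agree = [(thing in x) == bool((x == thing).any()) for thing in things]

# The oddity mentioned above: broadcasting makes this comparison
# array([True, False, False]), so `.any()` succeeds.
odd = np.array([1, 2, 3]) in np.array(1)


def contains_row(arr, row):
    """Hypothetical helper: True only if `row` equals an entire row of `arr`."""
    # Require every element in a row to match, then ask whether any row did.
    return bool((arr == row).all(axis=1).any())


print(agree, odd)
print(contains_row(x, [3, 4]), contains_row(x, [1, 7]))
```

Using all(axis=1) before any() is what turns the element-wise test into a row-wise one, which is probably what most people expect from `[1, 7] in x`.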
