numpy.r_ is not a function. What is it?

It's a class instance (aka an object):

In [2]: numpy.r_
Out[2]: <numpy.lib.index_tricks.RClass at 0x1923710>

A class is a construct which is used to define a distinct type - as such a class allows instances of itself. Each instance can have properties (member/instance variables and methods).

One of the methods a class can have is the __getitem__ method, which is called whenever you append [something, something, ..., something] to the name of the instance. In the case of the numpy.r_ instance, the method returns a numpy array.

Take the following class for example:

class myClass(object):
    def __getitem__(self, i):
        return i*2

Look at these outputs for the above class:

In [1]: a = myClass()

In [2]: a[3]
Out[2]: 6

In [3]: a[3,4]
Out[3]: (3, 4, 3, 4)

I am calling the __getitem__ method of myClass (via the [] brackets), and it returns its argument multiplied by 2. For a[3,4] the argument is the tuple (3, 4), so the result is that tuple repeated twice. It is not the class/instance behaving as a function; it is the __getitem__ method of the myClass instance that is being called.
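To make the equivalence concrete, the bracket syntax can be written out as an explicit method call. This is only a small sketch using the same toy class as above:

```python
class myClass(object):
    def __getitem__(self, i):
        return i * 2


a = myClass()

# Square brackets are shorthand for calling __getitem__ directly.
assert a[3] == a.__getitem__(3) == 6

# Several comma-separated items are packed into a single tuple argument,
# and multiplying a tuple by 2 repeats it.
assert a[3, 4] == a.__getitem__((3, 4)) == (3, 4, 3, 4)
```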

On a final note, you will notice that to instantiate myClass I had to do a = myClass(), whereas to get an instance of RClass you just use numpy.r_. This is because numpy instantiates RClass and binds it to the name numpy.r_ itself. This is the relevant line in the numpy source code. In my opinion this is rather ugly and confusing!


I would argue that for all practical purposes r_ is a function, but one implemented by a clever hack using different syntax. Mike already explained that r_ is in reality not a function but an instance of RClass, which implements __getitem__ so that you can use it as r_[1]. The cosmetic difference is that you use square brackets instead of round ones, so you are not making a function call but indexing the object. Although this is technically true, in practice it works just like a function call, but one that allows some extra syntax that a normal function does not.

The motivation for creating r_ probably comes from Matlab's syntax, which lets you construct arrays in a very compact way, like x = [1:10, 15, 20:10:100]. To achieve the same in numpy, you would have to do x = np.hstack((np.arange(1,11), 15, np.arange(20,110,10))). Using colons to create ranges is not allowed in general Python expressions, but colons do exist in slice notation for indexing into a list, like L[3:5], and even A[2:10, 20:30] for multi-dimensional arrays. Under the hood, this index notation gets transformed into a call to the __getitem__ method of the object, where each colon expression becomes a slice object:

In [13]: class C(object):
    ...:     def __getitem__(self, x):
    ...:         print(x)

In [14]: c = C()

In [15]: c[1:11, 15, 20:110:10]
(slice(1, 11, None), 15, slice(20, 110, 10))

The r_ object 'abuses' this fact to create a 'function' that accepts slice notation and that also does some additional things, like concatenating everything together and returning the result, so that you can write x = np.r_[1:11, 15, 20:110:10]. The "Not a function, so takes no parameters" in the documentation is slightly misleading ...
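To illustrate the trick, here is a toy imitation in pure Python. It is not how numpy implements r_: the names MiniR and mini_r are invented, it returns a plain list instead of an array, and it assumes every slice has an explicit stop value.

```python
class MiniR(object):
    """Toy imitation of the r_ idea; returns a list, not a numpy array."""

    def __getitem__(self, key):
        # A single item arrives bare; several items arrive as one tuple.
        if not isinstance(key, tuple):
            key = (key,)
        out = []
        for item in key:
            if isinstance(item, slice):
                # Expand start:stop:step like range(); fill in defaults
                # for a missing start or step (stop is assumed to be given).
                start = 0 if item.start is None else item.start
                step = 1 if item.step is None else item.step
                out.extend(range(start, item.stop, step))
            else:
                # Plain scalars are appended as they are.
                out.append(item)
        return out


mini_r = MiniR()
x = mini_r[1:11, 15, 20:110:10]
print(x)
```

The slice objects do all the work here: the class never parses any text, it just receives whatever Python built from the bracket expression and joins the pieces together.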