numpy.r_ is not a function. What is it?
It's a class instance (aka an object):
In [2]: numpy.r_
Out[2]: <numpy.lib.index_tricks.RClass at 0x1923710>
A class is a construct which is used to define a distinct type - as such a class allows instances of itself. Each instance can have properties (member/instance variables and methods).
One of the methods a class can have is __getitem__. It is called whenever you append [something, something, ..., something] to the name of the instance. In the case of the numpy.r_ instance, the method returns a numpy array.
Take the following class for example:
class myClass(object):
    def __getitem__(self, i):
        return i * 2
Look at these outputs for the above class:
In [1]: a = myClass()
In [2]: a[3]
Out[2]: 6
In [3]: a[3,4]
Out[3]: (3, 4, 3, 4)
I am calling the __getitem__ method of myClass (via the [] brackets), and the method returns its argument multiplied by 2: the integer 3 becomes 6, and the tuple (3, 4) is repeated to give (3, 4, 3, 4). It is not the class or the instance behaving as a function; it is the __getitem__ method of the myClass instance that is being called.
On a final note, you will notice that to instantiate myClass I had to do a = myClass(), whereas to get an instance of RClass you simply use numpy.r_. This is because numpy instantiates RClass and binds it to the name numpy.r_ itself. This is the relevant line in the numpy source code. In my opinion this is rather ugly and confusing!
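The pattern described above, where a class is instantiated once and the instance is bound to a public name, can be sketched roughly as follows. The names _DoublerClass and doubler are hypothetical and chosen only for illustration; this is not numpy's actual code.

```python
class _DoublerClass(object):
    """Hypothetical stand-in for a class such as RClass."""

    def __getitem__(self, index):
        # Whatever appears between the square brackets arrives here.
        return index * 2


# Instantiate once at module level and bind the instance to a public name,
# so users write doubler[...] without ever creating an instance themselves.
doubler = _DoublerClass()

print(doubler[3])         # 6
print(doubler[3, 4])      # (3, 4, 3, 4)
print(callable(doubler))  # False: the instance defines no __call__
```

Because the instance has __getitem__ but no __call__, doubler(3) would raise a TypeError, which is the sense in which such an object is "not a function".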
I would argue that for all practical purposes r_ is a function, but one implemented by a clever hack using different syntax. Mike already explained how r_ is in reality not a function but an instance of the class RClass, which implements __getitem__ so that you can use it as r_[1]. The cosmetic difference is that you use square brackets instead of parentheses, so you are not making a function call but indexing the object. Although this is technically true, for all practical purposes it works just like a function call, but one that allows some extra syntax that a normal function call does not.
The motivation for creating r_ probably comes from Matlab's syntax, which lets you construct arrays in a very compact way, like x = [1:10, 15, 20:10:100]. To achieve the same in numpy, you would have to do x = np.hstack((np.arange(1,11), 15, np.arange(20,110,10))). Using colons to create ranges is not allowed in ordinary Python expressions, but colons do exist in the slice notation used to index into a list, like L[3:5], and even A[2:10, 20:30] for multi-dimensional arrays. Under the hood, this index notation gets transformed into a call to the __getitem__ method of the object, where each colon expression is turned into a slice object:
In [13]: class C(object):
    ...:     def __getitem__(self, x):
    ...:         print(x)
In [14]: c = C()
In [15]: c[1:11, 15, 20:110:10]
(slice(1, 11, None), 15, slice(20, 110, 10))
The r_ object 'abuses' this fact to create a 'function' that accepts slice notation and also does some additional things, like concatenating everything together and returning the result, so that you can write x = np.r_[1:11, 15, 20:110:10]. The "Not a function, so takes no parameters" in the documentation is slightly misleading ...
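To make the trick concrete, here is a minimal pure-Python sketch of an r_-like object. It is only an imitation under simplifying assumptions: the names ListR and list_r are hypothetical, it returns a plain list rather than a numpy array, and it handles only integer slices and scalars, whereas the real r_ does considerably more.

```python
class ListR(object):
    """Hypothetical, simplified imitation of the r_ idea; returns a list."""

    def __getitem__(self, key):
        # A single index arrives as-is; comma-separated ones arrive as a tuple.
        items = key if isinstance(key, tuple) else (key,)
        result = []
        for item in items:
            if isinstance(item, slice):
                # Expand start:stop:step into the corresponding range.
                if item.stop is None:
                    raise ValueError("slice needs a stop value")
                start = 0 if item.start is None else item.start
                step = 1 if item.step is None else item.step
                result.extend(range(start, item.stop, step))
            else:
                # Plain scalars are appended unchanged.
                result.append(item)
        return result


# Bind a single instance to a name, as numpy does with r_.
list_r = ListR()

x = list_r[1:11, 15, 20:110:10]
print(x)
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100]
```

The slice syntax is only legal inside square brackets, which is why this has to be written as __getitem__ rather than as an ordinary function taking arguments.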