Understanding Python super() with __init__() methods
"I'm trying to understand super()"
The reason we use super is so that child classes that may be using cooperative multiple inheritance will call the correct next parent class function in the Method Resolution Order (MRO).
In Python 3, we can call it like this:
class ChildB(Base):
    def __init__(self):
        super().__init__()
In Python 2, we were required to call super like this, with the defining class's name and self, but we'll avoid this from now on because it's redundant, slower (due to the name lookups), and more verbose (so update your Python if you haven't already!):
super(ChildB, self).__init__()
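As a quick check that both spellings reach the same parent method in Python 3, here is a minimal sketch. The names ChildImplicit, ChildExplicit, and the calls list are hypothetical, introduced only for this illustration.

```python
calls = []  # hypothetical log, used only to observe which __init__ ran


class Base:
    def __init__(self):
        calls.append("Base")


class ChildImplicit(Base):
    def __init__(self):
        super().__init__()  # Python 3 zero-argument form


class ChildExplicit(Base):
    def __init__(self):
        # Python 2 style spelling, still accepted by Python 3
        super(ChildExplicit, self).__init__()


ChildImplicit()
ChildExplicit()
print(calls)  # ['Base', 'Base']
```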
Without super, you are limited in your ability to use multiple inheritance because you hard-wire the next parent's call:
Base.__init__(self) # Avoid this.
I further explain below.
"What difference is there actually in this code?:"
class ChildA(Base):
    def __init__(self):
        Base.__init__(self)

class ChildB(Base):
    def __init__(self):
        super().__init__()
The primary difference in this code is that in ChildB you get a layer of indirection in the __init__ with super, which uses the class in which it is defined to determine the next class's __init__ to look up in the MRO.
I illustrate this difference in an answer at the canonical question, How to use 'super' in Python?, which demonstrates dependency injection and cooperative multiple inheritance.
If Python didn't have super
Here's code that's actually closely equivalent to super (how it's implemented in C, minus some checking and fallback behavior, and translated to Python):
class ChildB(Base):
    def __init__(self):
        mro = type(self).mro()
        check_next = mro.index(ChildB) + 1  # next after *this* class.
        while check_next < len(mro):
            next_class = mro[check_next]
            if '__init__' in next_class.__dict__:
                next_class.__init__(self)
                break
            check_next += 1
Written a little more like native Python:
class ChildB(Base):
    def __init__(self):
        mro = type(self).mro()
        for next_class in mro[mro.index(ChildB) + 1:]:  # slice to end
            # Check the class's own namespace; hasattr() would also match inherited methods.
            if '__init__' in next_class.__dict__:
                next_class.__init__(self)
                break
If we didn't have the super object, we'd have to write this manual code everywhere (or recreate it!) to ensure that we call the proper next method in the Method Resolution Order!
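To see the manual lookup doing the same job as super, here is a small sketch that factors the loop into a helper and uses it in one class of a diamond hierarchy. The helper name call_next_init, the classes Left, Right, and Both, and the order list are all hypothetical, made up for this illustration.

```python
order = []  # hypothetical log of which __init__ methods ran, in order


def call_next_init(cls, instance):
    """Call the first __init__ defined after cls in the instance's MRO."""
    mro = type(instance).mro()
    for next_class in mro[mro.index(cls) + 1:]:
        if "__init__" in vars(next_class):  # only methods defined on the class itself
            next_class.__init__(instance)
            break


class Base:
    def __init__(self):
        order.append("Base")


class Left(Base):
    def __init__(self):
        order.append("Left")
        call_next_init(Left, self)  # manual stand-in for super().__init__()


class Right(Base):
    def __init__(self):
        order.append("Right")
        super().__init__()


class Both(Left, Right):
    def __init__(self):
        order.append("Both")
        super().__init__()


Both()
print(order)  # ['Both', 'Left', 'Right', 'Base']
```

Left never names Right, yet Right.__init__ still runs, because the lookup walks the MRO of the instance's type rather than Left's own bases.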
How does super do this in Python 3 without being told explicitly which class and instance the calling method belongs to?
It gets the calling stack frame, and finds the class (implicitly stored as a local free variable, __class__, making the calling function a closure over the class) and the first argument to that function, which should be the instance or class that informs it which Method Resolution Order (MRO) to use.
Since it requires that first argument for the MRO, using super with static methods is impossible as they do not have access to the MRO of the class from which they are called.
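The hidden __class__ cell and the static-method limitation can both be observed directly. This is a minimal sketch for CPython 3; the classes Base and Child, the methods describe and broken, and the error_message variable are hypothetical names used only here.

```python
class Base:
    def describe(self):
        return "Base"


class Child(Base):
    def describe(self):
        # Mentioning super makes the compiler give this function a __class__ cell.
        return "Child -> " + super().describe()

    @staticmethod
    def broken():
        # No first argument here, so zero-argument super() has nothing to work with.
        return super().describe()


print(Child.describe.__code__.co_freevars)                   # ('__class__',)
print(Child.describe.__closure__[0].cell_contents is Child)  # True
print(Child().describe())                                    # Child -> Base

error_message = None
try:
    Child.broken()
except RuntimeError as exc:
    error_message = str(exc)
print("static method failed:", error_message is not None)    # static method failed: True
```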
Criticisms of other answers:
super() lets you avoid referring to the base class explicitly, which can be nice. But the main advantage comes with multiple inheritance, where all sorts of fun stuff can happen. See the standard docs on super if you haven't already.
It's rather hand-wavy and doesn't tell us much, but the point of super is not to avoid writing the parent class. The point is to ensure that the next method in line in the method resolution order (MRO) is called. This becomes important in multiple inheritance.
I'll explain here.
class Base(object):
    def __init__(self):
        print("Base init'ed")

class ChildA(Base):
    def __init__(self):
        print("ChildA init'ed")
        Base.__init__(self)

class ChildB(Base):
    def __init__(self):
        print("ChildB init'ed")
        super().__init__()
And let's create a dependency that we want to be called after the Child:
class UserDependency(Base):
    def __init__(self):
        print("UserDependency init'ed")
        super().__init__()
Now remember, ChildB uses super, ChildA does not:
class UserA(ChildA, UserDependency):
    def __init__(self):
        print("UserA init'ed")
        super().__init__()

class UserB(ChildB, UserDependency):
    def __init__(self):
        print("UserB init'ed")
        super().__init__()
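It can help to look at the MRO of both user classes before running them. The sketch below repeats the class hierarchy with empty bodies (an intentional simplification, since only the inheritance order matters here) and prints each MRO.

```python
class Base:
    pass


class ChildA(Base):
    pass


class ChildB(Base):
    pass


class UserDependency(Base):
    pass


class UserA(ChildA, UserDependency):
    pass


class UserB(ChildB, UserDependency):
    pass


print([cls.__name__ for cls in UserA.__mro__])
# ['UserA', 'ChildA', 'UserDependency', 'Base', 'object']
print([cls.__name__ for cls in UserB.__mro__])
# ['UserB', 'ChildB', 'UserDependency', 'Base', 'object']
```

UserDependency sits between the child class and Base in both cases, so whether it actually runs depends only on whether the child class hands control to the next class in the MRO or jumps straight to Base.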
And UserA does not call the UserDependency method:
>>> UserA()
UserA init'ed
ChildA init'ed
Base init'ed
<__main__.UserA object at 0x0000000003403BA8>
But UserB does in fact call UserDependency because ChildB invokes super:
>>> UserB()
UserB init'ed
ChildB init'ed
UserDependency init'ed
Base init'ed
<__main__.UserB object at 0x0000000003403438>
Criticism for another answer
In no circumstance should you do the following, which another answer suggests, as you'll definitely get errors when you subclass ChildB:
super(self.__class__, self).__init__() # DON'T DO THIS! EVER.
(That answer is not clever or particularly interesting, but in spite of direct criticism in the comments and over 17 downvotes, the answerer persisted in suggesting it until a kind editor fixed his problem.)
Explanation: Using self.__class__ as a substitute for the class name in super() will lead to recursion. super lets us look up the next parent in the MRO (see the first section of this answer) for child classes. If you tell super we're in the child instance's method, it will then look up the next method in line (probably this one), resulting in recursion, probably causing a logical failure (in the answerer's example, it does) or a RuntimeError (a RecursionError in Python 3.5 and later) when the recursion depth is exceeded.
>>> class Polygon(object):
...     def __init__(self, id):
...         self.id = id
...
>>> class Rectangle(Polygon):
...     def __init__(self, id, width, height):
...         super(self.__class__, self).__init__(id)
...         self.shape = (width, height)
...
>>> class Square(Rectangle):
...     pass
...
>>> Square('a', 10, 10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __init__
TypeError: __init__() missing 2 required positional arguments: 'width' and 'height'
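The example above fails with a TypeError because the signatures differ. When the signatures happen to match, the same mistake shows up as unbounded recursion instead. Here is a minimal sketch, assuming Python 3.5 or later (where RecursionError exists); the classes Base, Middle, and Leaf and the outcome variable are hypothetical.

```python
class Base:
    def __init__(self):
        self.ready = True


class Middle(Base):
    def __init__(self):
        # Anti-pattern under discussion: self.__class__ is the *runtime* class.
        super(self.__class__, self).__init__()


class Leaf(Middle):
    pass


Middle()  # works: self.__class__ is Middle, so Base.__init__ runs

outcome = None
try:
    Leaf()  # self.__class__ is Leaf, so super(Leaf, self) finds Middle.__init__ again
except RecursionError:
    outcome = "RecursionError"
print(outcome)  # RecursionError
```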
Python 3's new super() calling method with no arguments fortunately allows us to sidestep this issue.
It's been noted that in Python 3.0+ you can use
super().__init__()
to make your call, which is concise and does not require you to reference the parent OR class names explicitly, which can be handy. I just want to add that for Python 2.7 or under, some people implement a name-insensitive behaviour by writing self.__class__ instead of the class name, i.e.
super(self.__class__, self).__init__() # DON'T DO THIS!
HOWEVER, this breaks calls to super for any classes that inherit from your class, where self.__class__ could return a child class. For example:
class Polygon(object):
    def __init__(self, id):
        self.id = id

class Rectangle(Polygon):
    def __init__(self, id, width, height):
        super(self.__class__, self).__init__(id)
        self.shape = (width, height)

class Square(Rectangle):
    pass
Here I have a class Square, which is a sub-class of Rectangle. Say I don't want to write a separate constructor for Square because the constructor for Rectangle is good enough, but for whatever reason I want to implement a Square so I can reimplement some other method.
When I create a Square using mSquare = Square('a', 10, 10), Python calls the constructor for Rectangle because I haven't given Square its own constructor. However, in the constructor for Rectangle, the call super(self.__class__, self) is going to return the superclass of mSquare's class (Square), which is Rectangle, so it calls the constructor for Rectangle again. This is how the infinite loop happens, as was mentioned by @S_C. In this case, when I run super(...).__init__() I am calling the constructor for Rectangle again, but since I pass it only id and not width and height, I get an error.
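For comparison, here is a sketch of the same hierarchy with the class resolved correctly, using the Python 3 zero-argument form. In Python 2 the equivalent spelling would be super(Rectangle, self).__init__(id). The variable name m_square is just an illustrative choice.

```python
class Polygon:
    def __init__(self, id):
        self.id = id


class Rectangle(Polygon):
    def __init__(self, id, width, height):
        # super() is bound to Rectangle (the defining class), not to type(self).
        super().__init__(id)
        self.shape = (width, height)


class Square(Rectangle):
    pass


m_square = Square('a', 10, 10)
print(m_square.id, m_square.shape)  # a (10, 10)
```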
super() lets you avoid referring to the base class explicitly, which can be nice. But the main advantage comes with multiple inheritance, where all sorts of fun stuff can happen. See the standard docs on super if you haven't already.
Note that the syntax changed in Python 3.0: you can just say super().__init__() instead of super(ChildB, self).__init__(), which IMO is quite a bit nicer. The standard docs also refer to a guide to using super(), which is quite explanatory.