super() raises "TypeError: must be type, not classobj" for new-style class
You can also use class TextParser(HTMLParser, object): which makes TextParser a new-style class, so super() can be used.
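A minimal sketch of this approach follows. The import fallback and the handle_data override are assumptions added for illustration so the snippet runs on both Python 2 and Python 3; the all_data attribute is borrowed from the TextParser code elsewhere in this thread.

```python
try:
    from HTMLParser import HTMLParser  # Python 2: an old-style class
except ImportError:
    from html.parser import HTMLParser  # Python 3: already new-style


class TextParser(HTMLParser, object):
    """Listing object as an extra base makes TextParser new-style on Python 2."""

    def __init__(self):
        # Works because TextParser now has object in its ancestry.
        super(TextParser, self).__init__()
        self.all_data = []

    def handle_data(self, data):
        # Collect every piece of text the parser encounters.
        self.all_data.append(data)


parser = TextParser()
parser.feed("<p>hello</p>")
parser.close()
print(parser.all_data)
```

On Python 3 the extra object base is redundant but harmless, since every class there already inherits from object.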
Alright, it's the usual "super() cannot be used with an old-style class".
However, the important point is that the correct test for "is this a new-style instance (i.e. object)?" is
>>> class OldStyle: pass
>>> instance = OldStyle()
>>> issubclass(instance.__class__, object)
False
and not (as in the question):
>>> isinstance(instance, object)
True
For classes, the correct "is this a new-style class" test is:
>>> issubclass(OldStyle, object) # OldStyle is not a new-style class
False
>>> issubclass(int, object) # int is a new-style class
True
The crucial point is that with old-style classes, the class of an instance and its type are distinct. Here, OldStyle().__class__ is OldStyle, which does not inherit from object, while type(OldStyle()) is the instance type, which does inherit from object. Basically, an old-style class just creates objects of type instance (whereas a new-style class creates objects whose type is the class itself). This is probably why the instance OldStyle() is an object: its type() inherits from object (the fact that its class does not inherit from object does not count: old-style classes merely construct new objects of type instance). Partial reference: https://stackoverflow.com/a/9699961/42973.
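The class/type split described above can be checked with a small sketch. The helper name class_and_type is hypothetical, and the result depends on the interpreter: the distinction exists only on Python 2, because Python 3 has no old-style classes.

```python
import sys


class OldStyle:  # old-style on Python 2 only; new-style on Python 3
    pass


def class_and_type(obj):
    """Return the names of obj.__class__ and of type(obj)."""
    return obj.__class__.__name__, type(obj).__name__


names = class_and_type(OldStyle())
# Python 2: the two names differ ("OldStyle" versus "instance").
# Python 3: both names are "OldStyle".
print(sys.version_info[0], names)
```

For a new-style object such as an int, both names agree on either version.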
PS: The difference between a new-style class and an old-style one can also be seen with:
>>> type(OldStyle) # OldStyle creates objects but is not itself a type
classobj
>>> isinstance(OldStyle, type)
False
>>> type(int) # A new-style class is a type
type
(old-style classes are not types, so they cannot be the type of their instances).
super() can be used only in new-style classes, which means the root class needs to inherit from the object class.
For example, the top class needs to look like this:
class SomeClass(object):
    def __init__(self):
        ....
not
class SomeClass():
    def __init__(self):
        ....
So the solution is to call the parent's __init__ method directly, like this:
class TextParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.all_data = []