How to fix this AttributeError?
There are one or two issues with the code you posted (mainly to do with initializing the HTMLParser
properly).
Try running this amended version of your script:

    from HTMLParser import HTMLParser

    class MLStripper(HTMLParser):
        def __init__(self):
            # initialize the base class
            HTMLParser.__init__(self)

        def read(self, data):
            # clear the current output before re-use
            self._lines = []
            # re-set the parser's state before re-use
            self.reset()
            self.feed(data)
            return ''.join(self._lines)

        def handle_data(self, d):
            self._lines.append(d)

    def strip_tags(html):
        s = MLStripper()
        return s.read(html)

    html = """Python's <code>easy_install</code>
    makes installing new packages extremely convenient.
    However, as far as I can tell, it doesn't implement
    the other common features of a dependency manager -
    listing and removing installed packages."""

    print strip_tags(html)

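For reference, the failure itself can be reproduced in a few lines. This is only a minimal sketch, assuming Python 3 (so the import is `html.parser` rather than the Python 2 `HTMLParser` module); `BrokenStripper` is a hypothetical name for a subclass whose `__init__` never initializes the base class.

```python
from html.parser import HTMLParser  # Python 3 module name (assumption)


class BrokenStripper(HTMLParser):  # hypothetical name, for illustration only
    def __init__(self):
        # Bug: the base class is never initialized, so the parser's
        # internal buffer (self.rawdata) is never created.
        self.fed = []

    def handle_data(self, d):
        self.fed.append(d)


try:
    BrokenStripper().feed("<b>hello</b>")
    error = None
except AttributeError as exc:
    # feed() tries to extend self.rawdata, which does not exist yet.
    error = exc

print(repr(error))
```

The missing attribute is created by the base class's `reset()`, which `HTMLParser.__init__` calls, so initializing the base class is enough to make the error go away.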
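You need to call the `__init__` of the superclass HTMLParser.
You can also do this with `super()`. Note that this works on Python 3; in Python 2, HTMLParser is an old-style class, so `super()` raises a TypeError there and you have to call `HTMLParser.__init__(self)` directly: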

    class MLStripper(HTMLParser):
        def __init__(self):
            # the base __init__ already calls self.reset(),
            # so no explicit reset is needed here
            super(MLStripper, self).__init__()
            self.fed = []

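Put together, a complete version of the `super()` approach might look like the sketch below. It assumes Python 3 (hence the `html.parser` import), and `get_data` is a hypothetical helper name that does not appear in the answers above.

```python
from html.parser import HTMLParser  # Python 3 module name (assumption)


class MLStripper(HTMLParser):
    def __init__(self):
        # Initialize the base class first; it calls reset() internally.
        super().__init__()
        self.fed = []

    def handle_data(self, d):
        # Called by the parser for each run of text between tags.
        self.fed.append(d)

    def get_data(self):  # hypothetical helper name
        return ''.join(self.fed)


def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    s.close()  # flush any text the parser is still buffering
    return s.get_data()


print(strip_tags("Python's <code>easy_install</code> is handy"))
```

Creating a fresh `MLStripper` per call keeps the sketch simple; reusing one instance would also require clearing `self.fed` and calling `reset()` between inputs.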
This error also appears if you override the reset method of the HTMLParser class.
In my case I had added a method named reset for some other functionality. Python does not warn you that you are overriding an existing method, but it breaks the HTMLParser class: `HTMLParser.__init__` calls `self.reset()` to set up the parser's internal state, so an override that does not call the base implementation leaves that state uninitialized.
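The effect of overriding reset can be sketched as follows. This assumes Python 3, and the class names `ShadowingParser` and `CooperativeParser` are hypothetical; the second class shows one possible fix (calling the base `reset()` from the override), the other being simply to rename your own method.

```python
from html.parser import HTMLParser  # Python 3 module name (assumption)


class ShadowingParser(HTMLParser):  # hypothetical name
    def reset(self):
        # Intended as unrelated functionality, but it replaces
        # HTMLParser.reset(), so the parser state is never set up.
        self.chunks = []

    def handle_data(self, d):
        self.chunks.append(d)


class CooperativeParser(HTMLParser):  # hypothetical name
    def reset(self):
        # Let the base class initialize its state, then add our own.
        super().reset()
        self.chunks = []

    def handle_data(self, d):
        self.chunks.append(d)


try:
    ShadowingParser().feed("<b>a</b>b")
    shadow_error = None
except AttributeError as exc:
    shadow_error = exc

good = CooperativeParser()
good.feed("<b>a</b>b")
good.close()

print(repr(shadow_error))
print(good.chunks)
```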