Catch UnicodeDecodeError exception while reading file line by line in Python 3
Instead of using a for loop, you could call next() on the file iterator yourself and catch the StopIteration manually.
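with open('file.txt', 'r') as f:
    while True:
        try:
            line = next(f)
            # process the decoded line here
        except StopIteration:
            # end of file reached
            break
        except UnicodeDecodeError:
            # handle the undecodable data here, then keep reading
            continue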
The Pythonic way is probably to register an error handler with codecs.register_error('special', handler) and pass its name through the errors parameter of the open function:
with open('file.txt', 'r', errors='special') as f:
    ...
That way, if a line contains bytes that cannot be decoded, the handler will be called with the UnicodeDecodeError. It can then either return a tuple of (replacement string, position at which decoding should resume) or re-raise the error.
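As a minimal sketch of such a handler: the name special_handler, the '<0x..>' replacement format, and the temporary sample file are illustrative assumptions, not part of the original answer. The handler receives the exception, builds a replacement from the offending bytes, and returns it together with the position where decoding should resume.

```python
import codecs
import os
import tempfile


def special_handler(exc):
    # Hypothetical handler: only decoding errors are handled, anything else is re-raised.
    if isinstance(exc, UnicodeDecodeError):
        bad_bytes = exc.object[exc.start:exc.end]
        # Replace each undecodable byte with a visible marker such as <0xff>.
        replacement = ''.join('<0x{:02x}>'.format(b) for b in bad_bytes)
        # Return (replacement text, position at which decoding resumes).
        return replacement, exc.end
    raise exc


codecs.register_error('special', special_handler)

# Create a small sample file containing one byte that is invalid in UTF-8.
with tempfile.NamedTemporaryFile('wb', suffix='.txt', delete=False) as tmp:
    tmp.write(b'good line\nbad \xff byte\nlast line\n')
    path = tmp.name

with open(path, 'r', encoding='utf-8', errors='special') as f:
    lines = [line.rstrip('\n') for line in f]

os.remove(path)
print(lines)
```

With this approach the for loop over the file keeps working unchanged, because the decoding problem is resolved inside the codec machinery before a line is ever handed to your code.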
If you want more explicit processing, an alternative is to open the file in binary mode and explicitly decode each line:
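with open('file.txt', 'rb') as f:
    for bline in f:
        try:
            # bytes.decode() uses UTF-8 unless another encoding is given
            line = bline.decode()
            print(line)
        except UnicodeDecodeError as e:
            # process error, for example report it and move on
            print('could not decode line:', e)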
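The binary-mode approach can also be wrapped in a small generator that reports which lines failed. This is only a sketch: the helper name read_lines_tolerantly and the in-memory sample data are hypothetical, and splitting on b'\n' assumes an ASCII-compatible encoding such as UTF-8 (it would not be appropriate for UTF-16, for instance).

```python
import io


def read_lines_tolerantly(binary_file, encoding='utf-8'):
    """Yield (line_number, text, error); text is None when decoding failed."""
    for number, raw in enumerate(binary_file, start=1):
        try:
            yield number, raw.decode(encoding), None
        except UnicodeDecodeError as exc:
            yield number, None, exc


# In-memory stand-in for a file opened with mode 'rb'.
sample = io.BytesIO(b'first\nbroken \xff\nthird\n')
results = list(read_lines_tolerantly(sample))

bad_lines = [number for number, text, err in results if err is not None]
good_text = [text for number, text, err in results if text is not None]
print(bad_lines)
print(good_text)
```

Because each line is decoded separately, a failure affects only that line, and the caller can decide per line whether to skip it, log it, or retry with another encoding.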