Python ASCII codec can't encode character error during write to CSV
The csv module in Python 2.x does not support Unicode directly. You have three options, in order of complexity:
Edit: See below
Use the fixed library https://github.com/jdunck/python-unicodecsv (pip install unicodecsv). Use it as a drop-in replacement. Example:
import unicodecsv
with open("myfile.csv", 'rb') as my_file:
    r = unicodecsv.DictReader(my_file, encoding='utf-8')
Read the CSV manual regarding Unicode: https://docs.python.org/2/library/csv.html (See examples at the bottom)
Manually encode each item as UTF-8:
for cell in row.findAll('td'):
    text = cell.text.replace('[','').replace(']','')
    list_of_cells.append(text.encode("utf-8"))
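The manual-encoding option can be wrapped in a small helper so every row goes through the same step. This is only a sketch: `encode_row` is a hypothetical name, not part of any library, and it assumes non-text cells (such as numbers) should be passed through unchanged. On Python 2, rows encoded this way can be handed to the standard csv.writer.

```python
def encode_row(row, encoding="utf-8"):
    """Return a copy of row with every text cell encoded to bytes."""
    text_type = type(u"")  # unicode on Python 2, str on Python 3
    return [cell.encode(encoding) if isinstance(cell, text_type) else cell
            for cell in row]

# Hypothetical sample row mixing non-ASCII text with a number.
encoded = encode_row([u"caf\u00e9", u"na\u00efve", 42])
```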
Edit: I found that python-unicodecsv is also broken when reading UTF-16. It complains about any 0x00 bytes.
Instead, use https://github.com/ryanhiebert/backports.csv, which more closely resembles the Python 3 implementation and uses the io module.
Install:
pip install backports.csv
Usage:
from backports import csv
import io
with io.open(filename, encoding='utf-8') as f:
    r = csv.reader(f)
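Since the original error happens on write, here is a rough sketch of the writing side using the same approach. It assumes backports.csv is installed on Python 2; on Python 3 it falls back to the built-in csv module, which exposes the same interface. The sample rows and the temporary output path are made up for illustration.

```python
from __future__ import unicode_literals  # text literals are unicode on Python 2 too

import io
import os
import tempfile

try:
    from backports import csv  # Python 2 (or 3) with backports.csv installed
except ImportError:
    import csv  # assumption: running on Python 3, whose csv has the same API

# Hypothetical sample data containing non-ASCII text.
rows = [["name", "city"], ["Zoë", "Kraków"], ["李", "北京"]]

path = os.path.join(tempfile.mkdtemp(), "out.csv")  # hypothetical output file

# Open in text mode with an explicit encoding; newline='' lets the csv
# module control line endings itself.
with io.open(path, "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)

# Read the file back to confirm the non-ASCII text survived the round trip.
with io.open(path, encoding="utf-8", newline="") as f:
    read_back = list(csv.reader(f))
```

The key point is that every cell is unicode text and the file object does the encoding, so no per-cell .encode() calls are needed.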