Python: solving unicode hell with unidecode
Use codecs.open
with codecs.open("test.txt", 'r', 'utf-8') as inf:
Edit: The above was for Python 2.x. For Python 3 you don't need to use codecs; the encoding parameter has been added to the regular open.
with open("test.txt", 'r', encoding='utf-8') as inf:
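
For a fuller picture of the Python 3 form, here is a minimal sketch. The helper name read_text, the sample string, and the temporary file are assumptions made for illustration; only open with its encoding parameter comes from the answer above.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Return the file's contents decoded to str using the given encoding."""
    with open(path, "r", encoding=encoding) as inf:
        return inf.read()


# Write a small UTF-8 sample to a temporary file, then read it back.
sample = "na\u00efve caf\u00e9"  # hypothetical non-ASCII sample text
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with open(path, "w", encoding="utf-8") as outf:
        outf.write(sample)
    text = read_text(path)
```

Passing the encoding explicitly avoids depending on the platform's default encoding, which can differ between machines.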
import codecs
with codecs.open('test.txt', encoding='whicheveronethefilewasencodedwith') as f:
...
The codecs module provides a function to open files with automatic Unicode encoding/decoding, among other things.
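
As a rough illustration of that automatic encoding/decoding, the sketch below writes Latin-1 bytes to a temporary file, reads them back as text through codecs.open, and then rewrites the file as UTF-8. The file name, the sample word, and the choice of Latin-1 are assumptions for the example; in practice you would pass whichever encoding the file actually uses.

```python
import codecs
import os
import tempfile

# Raw bytes as they might sit on disk in a Latin-1 encoded file (assumed sample).
raw = "Z\u00fcrich".encode("latin-1")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "latin1.txt")
    with open(path, "wb") as outf:
        outf.write(raw)

    # Reading: codecs.open decodes the bytes to text on the way in.
    with codecs.open(path, "r", encoding="latin-1") as f:
        decoded = f.read()

    # Writing: codecs.open encodes the text to bytes on the way out.
    with codecs.open(path, "w", encoding="utf-8") as f:
        f.write(decoded)

    # Inspect the raw bytes to confirm the file is now UTF-8.
    with open(path, "rb") as inf:
        reencoded = inf.read()
```

The same text ends up stored as different bytes depending on the encoding, which is why the encoding passed when reading has to match the one the file was written with.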