Python: UnicodeEncodeError when I use grep
If sys.stdout.isatty() is false (the output is redirected to a file or a pipe), set the PYTHONIOENCODING environment variable outside your script.
Always print Unicode; don't hardcode the character encoding of your environment inside your script:
$ PYTHONIOENCODING=utf-8 python simple.py | grep pattern
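To check which case applies at runtime, a small diagnostic can report what Python sees on stdout. This is only a sketch: describe_stream is a hypothetical helper name, not part of the standard library, and it writes to stderr so the report stays visible when stdout is piped into grep.

```python
from __future__ import print_function

import sys


def describe_stream(stream):
    """Report whether `stream` is a terminal and which encoding Python chose.

    Hypothetical helper for illustration; not part of the standard library.
    """
    return {
        # False when the output is redirected to a file or a pipe.
        'isatty': stream.isatty(),
        # May be None in Python 2 when the output is piped.
        'encoding': getattr(stream, 'encoding', None),
    }


if __name__ == '__main__':
    # Write to stderr so the report is still visible under `| grep pattern`.
    print(describe_stream(sys.stdout), file=sys.stderr)
```

Saved under a hypothetical name such as describe.py, running it once with a plain pipe and once with PYTHONIOENCODING=utf-8 set should show the reported encoding change.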
print needs to encode the string before sending it to stdout, but in Python 2, when the process is in a pipe, the value of sys.stdout.encoding is None. print then receives a unicode object and tries to encode it with the ascii codec; if the object contains non-ASCII characters, an exception is raised.
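The failure can be reproduced without a pipe by calling encode directly, which is roughly what print ends up doing when no encoding is known. A minimal sketch; try_encode is a hypothetical helper name used only for illustration.

```python
from __future__ import print_function


def try_encode(text, codec):
    """Return `text` encoded with `codec`, or None if the codec cannot represent it.

    Hypothetical helper for illustration only.
    """
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


# The name from the examples, written with an escape so this file stays ASCII.
name = u'\xc1lvaro'

# ascii covers only code points 0-127, so U+00C1 cannot be encoded.
print(repr(try_encode(name, 'ascii')))
# utf-8 can represent every Unicode code point.
print(repr(try_encode(name, 'utf-8')))
```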
You can solve this problem by encoding all unicode objects before sending them to standard output (but you'll need to guess which codec to use). See these examples:
File wrong.py:
# coding: utf-8
print u'Álvaro'
Result:
alvaro@ideas:/tmp
$ python wrong.py
Álvaro
alvaro@ideas:/tmp
$ python wrong.py | grep a
Traceback (most recent call last):
  File "wrong.py", line 3, in <module>
    print u'Álvaro'
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc1' in position 0: ordinal not in range(128)
File right.py:
# coding: utf-8
print u'Álvaro'.encode('utf-8')
# unicode object encoded == `str` in Python 2
Result:
alvaro@ideas:/tmp
$ python right.py
Álvaro
alvaro@ideas:/tmp
$ python right.py | grep a
Álvaro
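Rather than always guessing utf-8 as right.py does, one possible variation is to use the stream's own encoding when Python knows it and fall back to a guess only when it is None. This is a sketch under assumptions: encode_for_output is a hypothetical helper, and utf-8 as the fallback is still a guess about the environment, not something the interpreter reports.

```python
def encode_for_output(text, stream_encoding, fallback='utf-8'):
    """Encode `text` with `stream_encoding`, or with `fallback` if it is None.

    Hypothetical helper; `fallback` is an assumption about the environment.
    """
    codec = stream_encoding or fallback
    return text.encode(codec)


# Encoding unknown, as with a Python 2 process in a pipe: the fallback is used.
piped = encode_for_output(u'\xc1lvaro', None)
# Encoding reported by a hypothetical latin-1 terminal: it takes precedence.
terminal = encode_for_output(u'\xc1lvaro', 'latin-1')
```

In a script, the second argument would typically be sys.stdout.encoding. Setting PYTHONIOENCODING outside the script remains the less intrusive option, since it keeps the guess out of the code.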