LESSCHARSET=utf-8 less doesn't seem to work

  1. What does the locale command output? Is it a UTF-8 locale?

  2. Are you sure your terminal is set to display UTF-8? Does echo -e '\xe2\x82\xac' produce the € (euro) sign?

  3. Is the locale that you have set even installed on the system? Is it present in the list that locale -a outputs?

  4. What version of less are you using? (Run less --version to find out.) Really, really old versions did not even support LESSCHARSET. This is less likely to be the case, because I have a Debian "sarge" system with less version 382, and it does not even need LESSCHARSET if the locale is set correctly.
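
The four checks above can be run in one go with something like the following. This is only a rough sketch: the function names (charmap_is_utf8, print_euro, locale_installed) are made up for this example, and it assumes a POSIX shell with the standard locale, tr, and grep utilities. printf with octal escapes is used instead of echo -e because echo -e is not portable across shells.

```shell
#!/bin/sh
# Sketch of the locale/terminal checks; helper names are hypothetical.

# Check 1: is the character map of the effective locale UTF-8?
charmap_is_utf8() {
    [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]
}

# Check 2: emit the three UTF-8 bytes of the euro sign (U+20AC).
# A terminal that decodes UTF-8 should render them as a single glyph.
print_euro() {
    printf '\342\202\254\n'
}

# Check 3: is the locale named in $1 listed by "locale -a"?
# The codeset spelling varies (UTF-8 vs utf8), so hyphens are removed
# and case is folded before comparing.
locale_installed() {
    want=$(printf '%s' "$1" | tr -d '-' | tr '[:upper:]' '[:lower:]')
    locale -a 2>/dev/null | tr -d '-' | tr '[:upper:]' '[:lower:]' |
        grep -qxF -- "$want"
}

if charmap_is_utf8; then
    echo "charmap: UTF-8"
else
    echo "charmap: not UTF-8 (got: $(locale charmap 2>/dev/null))"
fi

print_euro

if [ -n "${LANG:-}" ]; then
    if locale_installed "$LANG"; then
        echo "LANG=$LANG is installed"
    else
        echo "LANG=$LANG is not in the output of locale -a"
    fi
fi

# Check 4: first line of the version banner, if less is available.
less --version 2>/dev/null | head -n 1
```

If the charmap line does not say UTF-8, or the euro line shows up as several odd characters, the problem is in the locale or terminal setup rather than in less.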


My guess is that your file isn't UTF-8 but rather ISO-8859-1. (Is the <F4> character supposed to be a 'ô'?)

Start an xterm with LANG=en_US.ISO-8859-1 xterm. Then verify the locale (the output of locale should be something like en_US.ISO-8859-1). Then use less to view the file. Does it display correctly?

Note that setting LESSCHARSET=iso8859 is not enough unless you also start a new terminal. LESSCHARSET only tells less which character set to assume the terminal can display, so with iso8859 it will send the byte \xf4 to the terminal as-is. Your terminal, however, is probably decoding UTF-8, since the euro sign displays correctly. A lone \xf4 byte is not a valid UTF-8 sequence, so the terminal will probably show something like '�'.
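
To see the mismatch without involving less or the terminal at all, you can ask iconv to validate the bytes. The snippet below is a sketch under a few assumptions: the helper name is_valid_utf8 is invented for this example, the sample word "hôtel" is just an illustration, and it relies on iconv exiting with a non-zero status when the input is not valid in the source encoding (which is how the common implementations behave).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if stdin is valid UTF-8, fails otherwise.
# Converting UTF-8 to UTF-8 changes nothing, but forces iconv to decode
# every byte, so invalid sequences make it exit with an error.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# "hôtel" as ISO-8859-1: the ô is the single byte 0xF4 (octal 364).
if printf 'h\364tel\n' | is_valid_utf8; then
    echo "ISO-8859-1 bytes: valid UTF-8"
else
    echo "ISO-8859-1 bytes: NOT valid UTF-8"
fi

# The same word as UTF-8: the ô is the two bytes 0xC3 0xB4.
if printf 'h\303\264tel\n' | is_valid_utf8; then
    echo "UTF-8 bytes: valid UTF-8"
else
    echo "UTF-8 bytes: NOT valid UTF-8"
fi

# Decoding the lone 0xF4 as ISO-8859-1 and re-encoding it as UTF-8
# produces the bytes c3 b4, shown here as a hex dump.
printf '\364' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1
```

The first check should fail and the second should pass, which is the same distinction a UTF-8 terminal makes when it draws '�' for the byte it cannot decode.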


Try the command file file.txt. If the output is, for example, "ISO-8859 English text", convert the file from ISO-8859-1 to UTF-8 with the command iconv -f ISO-8859-1 -t UTF-8 -o testfile.txt file.txt. If less testfile.txt displays correctly, finish with mv testfile.txt file.txt.
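
If you have to do this more than once, the convert-then-replace steps can be wrapped up roughly as follows. Treat it as a sketch: latin1_to_utf8 is a name made up for this example, it assumes the source encoding really is ISO-8859-1, and it writes through a redirect instead of -o in case your iconv lacks that option. It skips files that already decode as UTF-8, because converting an already-converted file a second time would turn 'ô' into 'Ã´'.

```shell
#!/bin/sh
# Hypothetical helper: convert file $1 from ISO-8859-1 to UTF-8 in place.
latin1_to_utf8() {
    src=$1

    # Already valid UTF-8 (this includes plain ASCII): leave it alone.
    if iconv -f UTF-8 -t UTF-8 "$src" >/dev/null 2>&1; then
        echo "$src: already valid UTF-8, left unchanged"
        return 0
    fi

    # Convert into a temporary file next to the original first, so a
    # failed conversion never clobbers the source.
    tmp=$(mktemp "${src}.XXXXXX") || return 1
    if iconv -f ISO-8859-1 -t UTF-8 "$src" >"$tmp"; then
        # Copy the contents back rather than mv, so the original file
        # keeps its permissions and ownership.
        cat "$tmp" >"$src"
        rm -f "$tmp"
        echo "$src: converted to UTF-8"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Convert the file given on the command line, if any.
if [ "$#" -gt 0 ]; then
    latin1_to_utf8 "$1"
fi
```

Saved as, say, a script of its own, it would be run with file.txt as its only argument; running it a second time on the same file should report that nothing was changed.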
