Vim shows strange characters <91>,<92>
91 and 92 are the hex codes for the opening and closing curly single quotes (the closing one doubles as the curly apostrophe) in cp1252/Windows-1252 (cp stands for code page), the default MS Windows variant of the latin1/ISO-8859-1 encoding.
These characters are most often inserted by people copying content from Word documents or Outlook emails, where the "Smart Quotes" feature produces them. Other problem characters in this code page are hex 93/94 (opening and closing curly double quotes), the bullet point (•) and the OE ligature (œ and Œ). You can see a full list of the "problem characters", the ones that don't map directly into ISO-8859-1 or UTF-8 with the same code, highlighted in green on the Wikipedia page for cp1252.
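To see which characters those bytes stand for, here is a minimal Python sketch (not part of the original answer; the names SMART_BYTES, describe and is_valid_utf8 are made up for illustration). It decodes each byte as cp1252 and also checks that the same bytes are not valid UTF-8 on their own, which is why an editor expecting UTF-8 falls back to showing <91> and <92>.

```python
import unicodedata

# Bytes that cp1252 uses for printable punctuation and ligatures.
# The selection is illustrative, not the full list of problem bytes.
SMART_BYTES = bytes([0x91, 0x92, 0x93, 0x94, 0x95, 0x8C, 0x9C])


def describe(raw):
    """Return one 'XX -> U+YYYY NAME' string per byte, decoding as cp1252."""
    lines = []
    for b in raw:
        ch = bytes([b]).decode("cp1252")
        lines.append(f"{b:02X} -> U+{ord(ch):04X} {unicodedata.name(ch)}")
    return lines


def is_valid_utf8(raw):
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


for line in describe(SMART_BYTES):
    print(line)  # first line: 91 -> U+2018 LEFT SINGLE QUOTATION MARK

print(is_valid_utf8(SMART_BYTES))  # False
```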
If all you want is to open the file in the correct encoding then use the ++enc=cp1252 option to the :e command:
:e ++enc=cp1252 filename.txt
You can replace a particular bad hex code in Vim with the substitute command (:s) and one of the following character-code escapes. These are the forms accepted inside a [] collection; outside brackets, prefix the escape with \% (for example \%x91):
\d123    decimal number of character
\o40     octal number of character up to 0377
\x20     hexadecimal number of character up to 0xff
\u20AC   hex. number of multibyte character up to 0xffff
\U1234   hex. number of multibyte character up to 0xffffffff
To change the hex 91/92 characters to straight single quotes throughout the file, you need to do:
:%s/[\x91\x92]/'/g
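If you would rather do the same clean-up outside Vim, the substitution can be sketched in Python on the raw bytes. This is only an illustration under two assumptions: the file really is a single-byte cp1252/latin1 file (in a UTF-8 file the bytes 0x91 to 0x94 can legitimately appear inside multibyte sequences), and you also want the 93/94 double quotes straightened, which goes beyond the Vim command above. The name straighten_quotes is hypothetical.

```python
# Translation table: cp1252 curly quotes -> ASCII straight quotes.
# 0x91/0x92 become ' and 0x93/0x94 become ".
_QUOTE_TABLE = bytes.maketrans(b"\x91\x92\x93\x94", b"''\"\"")


def straighten_quotes(raw):
    """Replace cp1252 smart-quote bytes with their ASCII equivalents."""
    return raw.translate(_QUOTE_TABLE)


sample = b"It\x92s a \x93test\x94"
print(straighten_quotes(sample))  # b'It\'s a "test"'
```

All other bytes pass through unchanged, so the result is still in whatever encoding the file was in to begin with.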
The content on your source web page was overzealously reformatted. The text was undoubtedly supposed to use (straight) single quotes (ASCII 39/0x27, U+0027) instead of curly single quotes (U+2018 and U+2019, which are 0x91 and 0x92 in CP1252, also known as MS-ANSI and WINDOWS-1252; a common 8-bit encoding on Windows).
Vim is showing you the hex codes because they are not valid in whatever encoding Vim is using (probably UTF-8). If you are editing text that has already been saved in a file, then you can reload the file as CP1252 with :e ++enc=cp1252; this should make the curly quotes visible. But there is no real reason to reload it as CP1252, just delete the 0x91 and 0x92 characters and replace them with single quotes.
Use iconv to convert the text file from CP1252 to UTF-8 before opening.
iconv -f cp1252 -t utf8 inputfile.csv > outputfile.csv
On macOS, use this:
iconv -f cp1252 -t UTF8-MAC inputfile.csv > outputfile.csv
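Where iconv is not available, the same conversion can be approximated in a few lines of Python. This is a sketch, not a drop-in replacement for iconv: the function name cp1252_to_utf8 and the file names are placeholders, and Python's cp1252 codec raises an error on the five byte values that Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D).

```python
from pathlib import Path


def cp1252_to_utf8(src, dst):
    """Read src as cp1252 and write the same text to dst as UTF-8.

    Works on bytes so that line endings are left exactly as they were.
    """
    text = Path(src).read_bytes().decode("cp1252")
    Path(dst).write_bytes(text.encode("utf-8"))


# Usage (placeholder file names):
# cp1252_to_utf8("inputfile.csv", "outputfile.csv")
```

After the conversion the curly quotes are stored as proper UTF-8 sequences, so Vim should display them as quotes rather than as <91> and <92>.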