Converting "unknown-8bit" charset to UTF-8
You can use enca or chardet; enca will probably be more successful.
If you know the language the document was written in, you can guess the encoding and try converting until you get the right results:
English, French, German, Spanish... – usually Windows-1252
Russian, Ukrainian... – usually Windows-1251
Polish, Czech, Hungarian... – usually Windows-1250 or ISO-8859-2
Japanese – usually Shift-JIS
and so on.
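The guess-and-check approach above can be sketched in Python. This is only a minimal illustration, not a definitive tool: the `CANDIDATES` table and the helper names `try_decodings` and `to_utf8` are made up for this example, and the codec names are Python's standard spellings for the encodings listed above.

```python
# Candidate encodings per language group, taken from the list above.
# The group names are hypothetical labels chosen for this sketch.
CANDIDATES = {
    "western": ["cp1252"],                        # English, French, German, Spanish...
    "cyrillic": ["cp1251"],                       # Russian, Ukrainian...
    "central_european": ["cp1250", "iso8859_2"],  # Polish, Czech, Hungarian...
    "japanese": ["shift_jis"],
}


def try_decodings(raw: bytes, encodings):
    """Return {encoding: decoded_text} for each encoding that decodes cleanly."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            # Some byte is not valid in this encoding, so rule it out.
            continue
    return results


def to_utf8(raw: bytes, encoding: str) -> bytes:
    """Re-encode bytes from the chosen source encoding to UTF-8."""
    return raw.decode(encoding).encode("utf-8")
```

A clean decode does not prove the guess is right: single-byte encodings such as Windows-1250 and Windows-1251 accept almost any byte sequence, so you still have to read each candidate result and pick the one that looks like real text before calling `to_utf8`.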