Which terminal encodings are default on Linux, and which are most common?
The oldest character encoding used in consoles such as the VT52 was ASCII.
That basic decision has been carried over for many years: most consoles still treat ASCII, as standardized by ANSI, as their baseline character set. The next family of encodings (in the West) is the ISO-8859 series (parts 1 to 16), each part covering one language or group of languages. The most common is ISO-8859-1 (Western European languages, including English); the others are used roughly in proportion to how widely the corresponding languages are used.
Then there is Unicode, the most general list of the world's characters, which on Linux is usually encoded as UTF-8.
That is the most common encoding for present-day terminals and programs on Linux.
From the most general settings to the most specific:
OS
The default in Debian has been UTF-8 since Etch, released on April 8, 2007.
Note : Fresh Debian/Etch installation have UTF8 enabled by default.
This is confirmed in the release notes:
The default encoding for new Debian GNU/Linux installations is UTF-8. A number of applications will also be set up to use UTF-8 by default.
What that means is that Debian (and Ubuntu, Mint, and many others) is UTF-8 capable by default.
locale
Which encoding (and country) is actually chosen, with the command dpkg-reconfigure locales, is left to the user's preference.
That configures the particular settings the computer reports through the locale command.
Each of the LC_* environment variables affects one specific country- or language-dependent category (character classification, collation, dates, numbers, messages, and so on), as defined by the POSIX spec.
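As a rough illustration of how those variables interact for the encoding: POSIX gives LC_ALL priority over the per-category variable (LC_CTYPE for character handling), which in turn has priority over LANG. The sketch below applies that order by hand; effective_ctype is a made-up helper name, not a standard command, and the final "C" fallback is only the common default, not something every system guarantees.

```shell
#!/bin/sh
# Print the locale name that governs character encoding (the LC_CTYPE
# category), using the POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
# effective_ctype is a hypothetical helper name used for illustration.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Nothing is set: assume the usual implementation default.
        printf '%s\n' "C"
    fi
}

# Show the value for the current environment.
effective_ctype
```

On a typical desktop where only LANG is set, this simply prints the value of LANG.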
tty
But the above are only general settings; a particular terminal may or may not match them. In general, though, the usual encoding for most terminals today is UTF-8.
Whether a particular terminal (tty) is set to UTF-8 can be checked with:
$ stty -a | grep -o '.iutf8'
iutf8
That is, there is no - before the printed result; a leading - (-iutf8) would mean the flag is off.
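For use in a script, the same check can be wrapped so that it answers on, off, or unknown instead of relying on reading the output by eye. This is only a sketch: iutf8_state is a hypothetical helper name, and it assumes the flag appears in the stty -a output as a whitespace-separated word, as it does on Linux.

```shell
#!/bin/sh
# Classify the iutf8 flag from the text that `stty -a` prints.
# iutf8_state is a hypothetical helper; it reads stty output on stdin.
iutf8_state() {
    settings=$(cat)
    if printf '%s\n' "$settings" |
        grep -Eq '(^|[[:space:]])-iutf8([[:space:]]|;|$)'; then
        echo off        # flag present with a leading "-": disabled
    elif printf '%s\n' "$settings" |
        grep -Eq '(^|[[:space:]])iutf8([[:space:]]|;|$)'; then
        echo on         # flag present without "-": enabled
    else
        echo unknown    # flag not listed (for example, a non-Linux stty)
    fi
}

# Only query stty when standard input really is a terminal.
if [ -t 0 ]; then
    stty -a | iutf8_state
fi
```

The -iutf8 case is tested first so that the disabled form is never mistaken for the enabled one.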
terminal
But the terminal (the GUI window) inside which the tty is (usually) running also has its own locale setting. If the settings are sane, probably:
$ locale charmap
UTF-8
will give the correct answer.
But that is just a quick and very shallow look at all the i18n settings of Linux/Unix.
Takeaway: assuming that Linux is using UTF-8 is probably your best bet.
I would use a heuristic similar to the one you are using for Windows users, but based on the LANG environment variable. For example, on my system:
$ echo $LANG
en_US.UTF-8
Here, the value says that I am using the English language (US variant), with UTF-8 as the encoding of filenames and files.
As a general rule, Linux users using UTF-8 will have "UTF-8" at the end of their LANG environment variable.
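That heuristic could be sketched in shell roughly as follows. The helper names locale_codeset and is_utf8_locale are invented for this example, and the sketch assumes the usual language[_territory][.codeset][@modifier] layout of locale names; it also accepts the spelling utf8, which some systems use.

```shell
#!/bin/sh
# Extract the codeset from a locale name of the form
# language[_territory][.codeset][@modifier], e.g. en_US.UTF-8.
# locale_codeset and is_utf8_locale are hypothetical helper names.
locale_codeset() {
    name=$1
    name=${name%%@*}                          # drop an optional @modifier
    case $name in
        *.*) printf '%s\n' "${name#*.}" ;;    # text after the first dot
        *)   printf '\n' ;;                   # no codeset in the name
    esac
}

# Succeed when the codeset is some spelling of UTF-8 (UTF-8, utf8, ...).
is_utf8_locale() {
    case $(locale_codeset "$1") in
        [Uu][Tt][Ff]-8|[Uu][Tt][Ff]8) return 0 ;;
        *) return 1 ;;
    esac
}

if is_utf8_locale "${LANG:-}"; then
    echo "LANG suggests UTF-8"
else
    echo "LANG does not name UTF-8"
fi
```

Keep in mind that LC_ALL or LC_CTYPE, when set, take precedence over LANG, so this remains a heuristic rather than a definitive test.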