What Unicode characters does pdfLaTeX support with a minimal preamble?
A current LaTeX format will input omsenc.dfu, ot1enc.dfu, t1enc.dfu and ts1enc.dfu (this is new in current LaTeX compared to the answer you linked to).
You can find all four files in tex/latex/base and check which Unicode input they support. As the names imply, their support range is related to the output encodings, but there is no strict 1-1 relationship. For example, t1enc.dfu also contains \DeclareUnicodeCharacter{00A0}{\nobreakspace}.
With a current LaTeX you do not need to load inputenc, since UTF-8 is the default input encoding anyway. So you also get this support with the following document:
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
Text goes here.
\end{document}
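As a small illustration of what those default mappings buy you, here is a sketch that types a few accented Latin characters directly. It assumes a current LaTeX format and a source file saved as UTF-8; the sample words are only an illustration, chosen because each character has a slot or an accent construction in T1.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
% Each non-ASCII character below is mapped by t1enc.dfu to a T1 glyph
% or to an accent command such as \"u or \'e.
Grüße, café, naïve, Œuvre, Łódź.
\end{document}
```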
Your question is somewhat underspecified, as "minimal preamble" can be interpreted to mean "the minimum required to support the Unicode characters needed", which is somewhat circular.
The example preamble posted produces the following if I add Cyrillic text
! Package inputenc Error: Unicode character П (U+041F)
(inputenc) not set up for use with LaTeX.
This is because Cyrillic code points are not set up by default. Independent of the input encoding, they would not typeset anyway, because the T1 font encoding is specified and it only covers the Latin alphabet.
You do not need inputenc in current LaTeX as UTF-8 is the default, and if you specify a font encoding such as X2 that includes Cyrillic, suitable Unicode mappings will be loaded from x2enc.dfu, which is in the base LaTeX distribution.
So this runs without error:
\documentclass{article}
\usepackage[T1,X2]{fontenc}
\begin{document}
{\fontencoding{T1}\selectfont Text goes here}. Привет
\end{document}
The file /usr/local/texlive/2020/texmf-dist/tex/latex/base/utf8enc.dfu (use kpsewhich utf8enc.dfu to find the file on your local system) lists all the characters declared in the encoding dfu files in the base distribution, but contributed packages may add more.
The command
grep '[.]dfu' `kpsewhich --all ls-R`
will list all the ones available. As well as the core Latin, Greek and Cyrillic encodings, I see armglyphs.dfu, pmboxdrawenc.dfu and otf-hangul.dfu, for example.
Basically, the restriction is not on the interpretation of UTF-8. pdfLaTeX's inputenc code understands the full UTF-8 encoding, so you can specify any Unicode number. But a font in pdfLaTeX can only have 256 characters, so most Unicode characters cannot be defined until you specify a font to cover the required character set.
If you have a font that covers a Unicode range, the matching inputenc mapping probably already exists (and will be input automatically for any font encoding declared in the preamble) or can easily be added.
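Adding a missing mapping yourself can be sketched as follows. This assumes a current LaTeX format, where \DeclareUnicodeCharacter is available without loading inputenc. The choice of U+2248 (ALMOST EQUAL TO) and its mapping to \approx are only an illustration and are not taken from any of the dfu files mentioned above.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
% Illustrative assumption: map U+2248 (ALMOST EQUAL TO) to the math symbol
% \approx; \ensuremath lets the character work in text and in math mode.
\DeclareUnicodeCharacter{2248}{\ensuremath{\approx}}
\begin{document}
$\pi$ ≈ 3.14
\end{document}
```

The first argument is the code point in hexadecimal. The second is ordinary LaTeX code, so the character still has to be available in some font that the document has set up.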