LaTeX Theory - How Symbols are Modeled Under the Hood
TeX knows thirteen kinds of atoms in math formulas and builds upon them, just as any formula in mathematics is built from atomic ones.
The atoms are Ord, Op, Rel, Bin, Open, Close, Punct, Inner, Over, Under, Acc, Rad and Vcent.
In fact only the first eight matter in the end, because the last five are converted to Ord.
Every atom has three fields: nucleus, subscript and superscript, each of which can in turn contain other atoms. Again the last five types are special in this respect, because for them only the nucleus really makes sense.
Ord is for “ordinary” symbols such as variables. Op is for “operators” such as \sum or \log. Rel and Bin are for “relation” and “operation” symbols (such as < or +, respectively). Open and Close refer to fences such as parentheses. Punct is for punctuation signs (such as the comma or semicolon).
An Inner atom is basically built from a \left...\right pair (and contains a subformula). Over results from \overline and Under from \underline. Acc comes from the primitive \mathaccent, which is called by commands such as \bar or \tilde. Rad stems from the \radical primitive, used internally by \sqrt. Vcent is a special object built from \vcenter.
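If you want to look at the atoms and their fields yourself, \showlists can print the math list TeX has built so far. The following is only a minimal sketch; the formula is an arbitrary choice of mine, and the two \showbox... settings are there because, as far as I know, LaTeX's defaults hide the list contents.

```latex
\documentclass{article}
\begin{document}
% Raise the display limits so the list contents are printed in full
% (assumption: the LaTeX defaults would otherwise abbreviate them).
\showboxbreadth=100
\showboxdepth=100
\tracingonline=1 % print to the terminal as well as to the .log file
% \showlists dumps the math list collected up to this point.
$x_i^2 + \sum_k a_k = \overline{z} \showlists$
\end{document}
```

In an interactive run TeX pauses at \showlists, as it does for \show; press Return to continue. The log should then contain one entry per atom (\mathord, \mathbin, \mathop, \mathrel, \overline), with the nucleus printed after a dot and the superscript and subscript fields after ^ and _.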
An Op atom can be followed by one of the commands \displaylimits, \limits or \nolimits; giving no specification is equivalent to adding \displaylimits: the subscript and superscript fields are typeset below and above the operator when the formula is typeset in display style (from $$...$$ or, in LaTeX parlance, \[...\] or similar environments) and beside the symbol in the other styles. There are also rules for possibly choosing a bigger version of the symbol in display style.
Any symbol or subformula can be made into an atom of a given type by specifying it as the argument to \mathord, \mathop, \mathrel, \mathbin, \mathopen, \mathclose, \mathpunct or \mathinner. However, \mathord{...} is equivalent to the simpler {...}.
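As an illustration, here is a sketch that changes the type of a symbol by hand. The letter R used as a relation is an arbitrary example of mine.

```latex
\documentclass{article}
\begin{document}
% A plain letter is an Ord atom: no extra space around R.
\[ a R b \]
% The same letter as a Rel atom: relation spacing on both sides.
\[ a \mathrel{R} b \]
% \mathord{+} and {+} both yield an Ord atom, so the usual
% Bin spacing around + disappears in both cases.
\[ a \mathord{+} b \qquad a {+} b \]
\end{document}
```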
Your particular question is about \bar and \overline. Something like \bar{abc} becomes (temporarily) an Acc atom; the accent is placed above the whole subformula but has no wider version, so it ends up covering just the b. With \widetilde it is different, because the \mathaccent command points to a glyph that has wider variants (this information is encoded in the font). With \overline{abc}, instead, a rule is drawn above the whole subformula, making a single Over atom (which will later be treated as Ord as far as spacing is concerned).
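You can see the difference side by side with a short test, sketched here with the default article fonts (other math fonts may offer more or fewer wide variants):

```latex
\documentclass{article}
\begin{document}
\[
  \bar{abc}       \qquad % Acc atom: fixed-width accent, centered over abc
  \overline{abc}  \qquad % Over atom: rule spanning the whole subformula
  \tilde{abc}     \qquad % Acc atom: fixed-width accent again
  \widetilde{abc}        % Acc atom whose glyph has wider variants
\]
\end{document}
```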
After the input has been processed, with atom types assigned according to internal tables (which make \sum an Op, = a Rel and so on), the resulting math list is reprocessed to add the appropriate math spacing, after the Over, Under, Acc, Rad and Vcent atoms have been transformed to Ord; it is then processed again to turn it into “boxes and glue”.
The whole Appendix G in the TeXbook is devoted to the rules for such processing.
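The spaces inserted between atoms come from three parameters: \thinmuskip, \medmuskip and \thickmuskip. A rough way to see which one is used where is to change them inside a formula; the values in effect at the end of the formula apply to all of it. The formula and the exaggerated values below are arbitrary choices for this sketch.

```latex
\documentclass{article}
\begin{document}
% Normal spacing: medium space around + (Bin), thick space around = (Rel),
% thin space between \sin (Op) and c (Ord).
\[ a + b = \sin c \]
% All three kinds of space set to zero: the atoms touch.
\[ \thinmuskip=0mu \medmuskip=0mu \thickmuskip=0mu a + b = \sin c \]
% Only the medium space exaggerated: just the space around + grows.
\[ \medmuskip=30mu a + b = \sin c \]
\end{document}
```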
The main concept that controls math spacing is the math class. Compare the two expressions below.
In the first, every atom has math class 0 (\mathord), so it gets no special spacing.
In the second, operators are specified with \mathop, infix binary operators are marked with \mathbin and relations are marked with \mathrel, and you see the classic TeX spacing.
\documentclass{article}
\begin{document}
\[{X}_{0}^{n}{+}\mathrm{cos}{x}{=}0\]
\[\mathop{X}_{0}^{n}\mathbin{+}\mathop{\mathrm{cos}}{x}\mathrel{=}0\]
\end{document}
Of course you do not normally have to classify symbols by hand like this. For example, = is declared in LaTeX by
\DeclareMathSymbol{=}{\mathrel}{operators}{"3D}
so by default it is \mathrel. Similarly,
\DeclareMathSymbol{+}{\mathbin}{operators}{"2B}
declares that by default + is a \mathbin, and \cos is defined by
\def\cos{\mathop{\operator@font cos}\nolimits}
so if you use \cos rather than \mathrm{cos} then you get the extra operator spacing.
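The same pattern works for operator names that LaTeX does not predefine. In the sketch below \sgn is a hypothetical name chosen for illustration, and \mathrm stands in for the internal \operator@font so that no \makeatletter is needed.

```latex
\documentclass{article}
% \sgn is a hypothetical operator name, modeled on the definition of \cos.
\newcommand{\sgn}{\mathop{\mathrm{sgn}}\nolimits}
\begin{document}
% \cos is an Op atom and gets a thin space on both sides;
% \mathrm{cos} is an Ord atom and gets none.
\[ 2\cos x \qquad 2\mathrm{cos} x \]
% The hand-made operator is spaced like \cos.
\[ 2\sgn x \]
\end{document}
```

With amsmath loaded, \DeclareMathOperator does the same job more conveniently.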