Redefining `---`
I still owe the answer I (more or less) promised in order to show how this issue could be addressed by means of patches to the font metric data. I have put it off until now because the situation we want to cope with is fairly complex and intricate, and deserves a careful explanation.
General Discussion
The fact that the input --- is translated into an em-dash is a feature of the specific font you are using: it is exactly the same mechanism that turns the input fi into the “fi” ligature. So, the question seems to be about how to change the look intentionally given to a font by its designer; but, as Ulrike Fischer notes in this comment to the question Changing “-” to \textendash, “the general advice if one doesn’t like the look of a font is: use another font”.
Note that the glyph that represents a ligature could also be accessed directly via a suitably defined control sequence (or via the \symbol LaTeX command, or the \char TeX primitive); in our case, the glyph for the em-dash is also accessible through the control sequence (or LICR name) \textemdash, and it is not clear from the question whether it should be replaced also when it is referred to in this way, or only when the source contains the string ---.
Still on the subject of ligatures, it should be remarked that there are well-known methods to selectively disable a given ligature, among which I’d like to recall the following two:
- The selnolig package allows for context-based (as opposed to document-wide) suppression of ligatures. Note, however, that it requires LuaTeX as the typesetting engine in order to deploy any significant functionality.
- On the other hand, pdfTeX provides a primitive (\tagcode) that makes it possible to selectively disable certain properties of a given character, among which is its ability to form ligatures; it does not permit, however, context-based disablement.
But what this question asks for is not simply to disable a ligature, but to replace it with a different character combination, and neither of the two aforementioned methods makes any provision for attaining this: indeed, this entails rewriting the ligature program. It is for this reason that I’m taking the trouble to write this answer: it might be of some interest to show how the ligature programs of a font can be modified.
Other Possible Solutions Unrelated to Font Metrics
I’ll consider two possibilities: making the - character active and using a different input coding; but I’ll only touch upon them, because they are not the subject of this answer.
Turning - into an active character
An active character is a character that behaves like a macro: it can be assigned a definition, whose contents replace every later occurrence of that character. In this case, the idea is to define - as a macro that looks ahead to see if two more identical characters follow, and in that case substitutes the desired character combination for the --- triplet. Unfortunately, it is extremely awkward to accomplish this with the - character, because it also plays a syntactic rôle in numeric expressions used to communicate with TeX itself (e.g., \setcounter{FOO}{-1}). In this respect, note that ---1 is a perfectly legal instance of a <number>. It should be remarked, too, that this solution would not affect the output produced by \textemdash, which, as said above, might or might not be what is wanted.
Using a different input coding
Since computers are notoriously dumb, it’s always best, when dealing with them, to use different symbols to convey different meanings. By means of the inputenc package, you can simply write an em-dash in your source to indicate an em-dash in the output: the package will automatically translate it into a \textemdash command, with no possible confusion with the - symbol used for the arithmetic of TeX. For this reason, I deem that Manuel’s answer is the best solution of all.
My Solution Based on Font Metrics: A Proof of Concept
TeX finds directions on how to build ligatures from the characters of a given font inside the TFM (TeX Font Metric) file of the font itself, in the form of a collection of so-called ligature programs. Ligature programs are best written during the design phase of a font, e.g., at the level of METAFONT sources, for fonts designed using that language; but it is also possible to fiddle with them at a later stage, if—as it may well happen—the design phase is not accessible to you. It suffices to convert the relevant TFM file into a PL (Property List) file, which is its exact human-readable counterpart, make the desired changes in this PL file and then “compile” it back into a TFM file. If you give your modified TFM file the same name as the original one and place it, for example, in the same directory as the TeX source you want to compile, with a “bit of luck” (see below for details) TeX will load it instead of the original one, thereby following your ligature programs. The same will happen if you place it in your personal, or local (machine-wide) texmf tree: see the documentation of your distribution to learn about the appropriate location. For example, for MacTeX the relevant directories are
~/Library/texmf/fonts/tfm/
(personal texmf tree), or
/usr/local/texlive/texmf-local/fonts/tfm/
(local texmf tree).
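As a rough sketch of the installation step, the following helper copies a TFM file into the fonts/tfm/ branch of a texmf tree. The function name install_tfm and the local/noemdlig sub-directory are made up for this illustration; asking kpsewhich for TEXMFHOME is the usual way to find the personal tree, but do check what your own distribution says.

```bash
#!/bin/bash
# install_tfm FILE TREE_ROOT
# Copies FILE below TREE_ROOT/fonts/tfm/.  The "local/noemdlig" part is an
# arbitrary choice (assumption): TeX searches fonts/tfm/ recursively.
install_tfm () {
    local file="$1" root="${2%/}"
    local dest="$root/fonts/tfm/local/noemdlig"
    mkdir -p "$dest" && cp -- "$file" "$dest/"
}

# Possible use (left as a comment because it needs a TeX installation):
#   install_tfm cmr12.tfm "$(kpsewhich -var-value=TEXMFHOME)"
# For the machine-wide tree, the file-name database usually has to be
# refreshed afterwards (mktexlsr); the personal tree normally does not need it.
```

Keeping the patched file in a sub-directory of its own makes it easy to remove again when the experiment is over.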
I’d like to stress once again, though, that a modification to the ligature program of a font can obviously change neither the output of \textemdash nor that of an em-dash written verbatim in the source file and interpreted via the inputenc package. So, from now on I’ll assume that the question being asked is only about how to change the output of the literal input sequence --- (which is what the question says, after all).
Patching a ligature program
Let’s look directly at a practical example: we shall modify the ligature program of the cmr12 font in such a way that --- produces two en-dashes separated by a very thin space (0.9pt). The reason why we choose cmr12, and not, say, cmr10 for this first example will become clear below.
Create a new directory for this example and move to it. Then type
tftopl cmr12.tfm cmr12.pl
on the command line. This generates, in the current directory, the file cmr12.pl. Note that the tftopl utility uses the same search paths as TeX for finding TFM files, so it should automatically have picked up the original one.
Open the cmr12.pl file you’ve just created; it’s pretty long, but the first lines should read as follows
(FAMILY CMR)
(FACE O 346)
(CODINGSCHEME TEX TEXT)
(DESIGNSIZE R 12.0)
(COMMENT DESIGNSIZE IS IN POINTS)
(COMMENT OTHER SIZES ARE MULTIPLES OF DESIGNSIZE)
(CHECKSUM O 13052650413)
(FONTDIMEN
(SLANT R 0.0)
(SPACE R 0.3263855)
(STRETCH R 0.163193)
(SHRINK R 0.108795)
(XHEIGHT R 0.430556)
(QUAD R 0.9791565)
(EXTRASPACE R 0.108795)
)
(LIGTABLE
The (LIGTABLE line marks the beginning of the collection (or table) of ligature programs. We cannot describe their format here, but in order to get the information necessary to read on, you may type
texdoc pltotf
on the command line and read section 13 of the document that is displayed. Be aware, though, that it is rather terse (it’s only about half a page long).
The hyphen character has octal code 55, and you’ll find its associated ligature program a few lines below:
(LABEL O 55)
(LIG O 55 O 173)
(STOP)
It dictates that, if another identical character follows, both characters should be replaced by the character found in slot octal 173; this is the en-dash. The new character is subject in turn to ligature substitution: indeed, we find the relevant ligature program immediately below.
(LABEL O 173)
(LIG O 55 O 174)
(STOP)
It says that if an en-dash (octal 173) is followed by a hyphen (octal 55), the pair is replaced with an em-dash (octal 174). Hence, we see that this is exactly the single program we need to change. A possible patch is the following (you should put the following four lines in place of the above three):
(LABEL O 173)
(/LIG O 55 O 173)
(KRN O 173 R 0.075)
(STOP)
Now the LIG instruction is written as (/LIG O 55 O 173): the initial slash specifies that the character this program applies to is not to be deleted, and this is the reason for the ensuing KRN instruction, which causes two adjacent en-dashes (even if the second one has just been added by a ligature instruction) to be separated by a positive kern of 0.075 times the design size of the font, that is, in the case of this font, of 0.9pt.
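The arithmetic behind the KRN value can be sketched as follows. The two function names are invented for this illustration; they simply multiply or divide by the design size, which is how the value in a KRN instruction relates to an actual length.

```bash
#!/bin/bash
# kern_in_pt KRN_VALUE DESIGN_SIZE_PT
# Converts a KRN value (in design-size units) into points.
kern_in_pt () {
    awk -v k="$1" -v d="$2" 'BEGIN { printf "%g\n", k * d }'
}

# krn_for_pt WANTED_PT DESIGN_SIZE_PT
# Goes the other way: which KRN value yields the wanted kern in points?
krn_for_pt () {
    awk -v w="$1" -v d="$2" 'BEGIN { printf "%g\n", w / d }'
}

kern_in_pt 0.075 12    # cmr12: prints 0.9
kern_in_pt 0.075 10    # cmr10: prints 0.75
krn_for_pt 0.9 12      # prints 0.075
```

So the same KRN value gives a proportionally smaller kern in a font with a smaller design size.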
There’s only one more change you should apply to the file before saving it: go to the top and delete the line that reads
(CHECKSUM O 13052650413)
This will instruct the pltotf program to recompute the checksum, taking our modifications into account. You can now save the modified file, preferably under a new name, for example noemdlig-cmr12.pl.
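If you would rather not edit the file by hand, the two changes described above (replacing the program for slot O 173 and dropping the CHECKSUM line) can be sketched as a small filter. This is only an illustration for the OT1 slot numbers discussed here; the function name patch_ot1_pl is made up, and the filter assumes the ligature program looks exactly like the three lines quoted above.

```bash
#!/bin/bash
# patch_ot1_pl [KERN]
# Reads a PL file on stdin and writes the patched version to stdout.
# KERN is the value for the KRN instruction (default 0.075).
patch_ot1_pl () {
    local kern="${1:-0.075}"
    awk -v kern="$kern" '
        /^\(CHECKSUM / { next }        # drop it, so pltotf recomputes it
        /^ *\(LABEL O 173\)$/ { in_prog = 1; print; next }
        in_prog && /^ *\(LIG O 55 O 174\)$/ {
            indent = $0; sub(/\(.*/, "", indent)   # keep the original indentation
            print indent "(/LIG O 55 O 173)"
            print indent "(KRN O 173 R " kern ")"
            next
        }
        /^ *\(STOP\)$/ { in_prog = 0 }
        { print }                      # everything else passes through unchanged
    '
}

# Possible use (needs the cmr12.pl file generated earlier):
#   patch_ot1_pl 0.075 < cmr12.pl > noemdlig-cmr12.pl
```

If the expected LIG line is not found, the filter leaves the ligature table untouched, so it is worth comparing input and output before going on.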
“Compiling” and using the new TFM file
After editing the PL file, you need to “compile” it into a TFM file; you do so by giving the command
pltotf noemdlig-cmr12.pl cmr12.tfm
Note that the pltotf program should both load the PL file from, and save the TFM file to, the current directory.
OK, let’s now try it: save the following test LaTeX source in the same directory as the modified cmr12.tfm file
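One way to double-check the result is to convert the new TFM file back into a property list and look for the old ligature instruction. The helper below is a hypothetical sketch (the name has_emdash_lig is invented); it only inspects text on its standard input, and it looks for the OT1 instruction discussed above.

```bash
#!/bin/bash
# has_emdash_lig
# Succeeds (exit status 0) if the property list on stdin still contains
# the OT1 "en-dash followed by hyphen gives em-dash" instruction.
has_emdash_lig () {
    grep -Eq '^[[:space:]]*\(LIG O 55 O 174\)[[:space:]]*$'
}

# Possible use (needs a TeX installation; "./" asks for the local file):
#   tftopl ./cmr12.tfm check.pl
#   if has_emdash_lig < check.pl; then echo "still the original ligature"; fi
```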
\documentclass[12pt,% This is crucial!
a4paper% This is not, of course!
]{article}
% Note that you must *not* change the default OT1 encoding!
\newcommand*\myFontID{}
\begin{document}
We are now using the font\edef\myFontID{\the\font}
\texttt{\fontname\myFontID},
identified internally by the control sequence
\texttt{\expandafter\string\myFontID}.
This is an em-dash formed by a ligature:%
~---\spacefactor\sfcode`. \space
This, on the other hand, is an em-dash produced by
\verb|\textemdash|:~\textemdash
\end{document}
and compile it; with a bit of luck (see next subsection) you will get the following output
in which you can see that the first em-dash has been replaced by two en-dashes separated by a thin space.
The problem of preloaded fonts
You are probably wondering why we insisted that the 12-point size of the CMR font be used in the above example. Well, let's try to repeat the whole process with the cmr10 font. Create a new directory, move to it, type
tftopl cmr10.tfm cmr10.pl
on the command line, so as to obtain, in the current directory, the file cmr10.pl; modify this file in the same way as above and save it under the name noemdlig-cmr10.pl; then “compile” the new metrics file by issuing the command
pltotf noemdlig-cmr10.pl cmr10.tfm
However, if you do so, and then try to compile the same .tex source shown above, with the sole modification of the 12pt option being replaced by 10pt, you won’t get the expected outcome: the em-dash produced by --- stays a single, solid em-dash. Why does this happen?
The reason is that the LaTeX format “incorporates”, as it were, some fonts that were preloaded during the construction of the format itself. All information pertaining to these preloaded fonts, including ligature programs, is already present in the “frozen” image of TeX’s memory that the format records, because it was read when the format was constructed. TeX never loads the same TFM file more than once, so for preloaded fonts it is simply impossible to change the original ligatures. Under the customary settings, the cmr10 font is preloaded, but cmr12 is not: this is why our first example worked, but the second did not. Note, however, that the set of fonts that are preloaded into the LaTeX format is one of the few aspects that site maintainers are permitted to customize, so the settings might be different in your particular LaTeX installation: this is why, above, we said that the 12-point example required “a bit of luck” to work.
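To find out which fonts are preloaded on a given installation, one possibility is to inspect the log that was written when the format was built. The sketch below assumes that this log lists each font stored in the format on a line of the form \font\<identifier>=<TFM name>, which is what TeX prints when it dumps a format; the function name and the location of the log are assumptions to be adapted to your system.

```bash
#!/bin/bash
# list_preloaded_fonts
# Reads a format-building log on stdin and prints the TFM names found on
# lines of the form  \font\<identifier>=<tfm name>[ at <size>].
list_preloaded_fonts () {
    sed -n 's/^\\font\\[^=]*=\([^ ]*\).*$/\1/p'
}

# Possible use (the path is installation-dependent, hence only a comment):
#   list_preloaded_fonts < /path/to/pdflatex.log | grep -x cmr10
```

If cmr10 shows up in that list and cmr12 does not, the behaviour described above is what you should expect.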
As said, there is no way around this, apart from using a different name for the font. For example, we could compile the modified PL file into a TFM file with the same base name as the PL file
pltotf noemdlig-cmr10.pl noemdlig-cmr10.tfm
and then instruct LaTeX to use this TFM file instead of cmr10.tfm:
\documentclass[10pt,% This is crucial!
a4paper% This is not, of course!
]{article}
% Define a new font family called "modcmr", and specify that
% "noemdlig-cmr10.tfm" is the TFM file associated with the font from this
% family in medium weight, normal shape, and 10 point size:
\DeclareFontFamily{OT1}{modcmr}{}
\DeclareFontShape{OT1}{modcmr}{m}{n}{<10> noemdlig-cmr10}{}
% Tell pdfTeX where to find the corresponding PFB file:
\pdfmapline{+noemdlig-cmr10 CMR10 <cmr10.pfb}
% Set "modcmr" as the default seriffed family
% (BEWARE, THIS WON'T WORK IN GENERAL!):
\renewcommand{\rmdefault}{modcmr}
\newcommand*\myFontID{}
\begin{document}
We are now using the font\edef\myFontID{\the\font}
\texttt{\fontname\myFontID},
identified internally by the control sequence
\texttt{\expandafter\string\myFontID}.
This is an em-dash formed by a ligature:%
~---\spacefactor\sfcode`. \space
This, on the other hand, is an em-dash produced by
\verb|\textemdash|:~\textemdash
\end{document}
Two points in this code are particularly noteworthy:
- The file is specifically intended for the pdfTeX engine. The line
\pdfmapline{+noemdlig-cmr10 CMR10 <cmr10.pfb}
is necessary to tell pdfTeX where the glyphs of our new font should be taken from; this will not work, e.g., when producing DVI output with latex. Another possible approach, which dispenses with the \pdfmapline line and can therefore be applied with any engine, is to use a virtual font, that “knows” by itself where to find its own glyphs; however, this entails creating an additional file (a VF, or “Virtual Font”, file) that must subsequently be read by TeX at every compilation, so this alternative is less efficient both in terms of space and time.
- The line
\renewcommand{\rmdefault}{modcmr}
instructs LaTeX to use our modcmr family as the default seriffed family for all weights, shapes, and sizes. In order for it to work correctly, therefore, we should supply modified versions not only of the cmr font, but also of cmbx, cmsl, and cmti, for all the available sizes!
In particular, the second point shows that, in practice, the problem of preloaded fonts is not limited to those shapes and sizes that are actually preloaded, but propagates to the whole families of which those shapes and sizes are part. In other words, there’s no efficient way of solving it…
My Solution Based on Font Metrics: Working Examples
Can the strategy outlined above be applied in any real-life example? Yes and no, I’d reply; the following two examples should illustrate what I mean.
A reasonably sized example
The first example uses the times package. Since the Times font is a scalable font, in the setting provided by this package, for each given font shape a single TFM file is used at all sizes. It turns out that just eight TFM files need to be patched, and this is why this example can still be deemed reasonable. Eight files are already enough, however, to make a shell script worthwhile.
Create a new directory for this example and put the following TeX source into it:
\documentclass[a4paper]{article} % To avoid confusion, let us explicitly
% declare the paper format.
\usepackage[T1]{fontenc} % This example uses T1 encoding; using OT1
% encoding is also possible, but requires some
% other modifications besides removing (or
% changing) this declaration.
\usepackage{times} % This example specifically uses the fonts
% selected by this package.
\newcommand*\myFontID{}
\DeclareRobustCommand*\command[1]{%
\texttt{\char\escapechar #1}%
}
\newcommand*\testtext{%
\par
We are now using the font\edef\myFontID{\the\font}
\texttt{\fontname\myFontID}
(external name, optionally followed by
\texttt{at} size at which it is loaded),
identified internally by the control sequence
\texttt{\expandafter\string\myFontID}.
This is an em-dash formed by a ligature:%
~---\spacefactor\sfcode`. \space
This, on the other hand, is an em-dash produced by
\command{textemdash}:~\textemdash
\par
}
\newcommand*\fulltest[1]{%
\subsection{Test for \command{#1}}%
\begingroup
\csname #1\endcsname
\parskip \smallskipamount
\tolerance 9000
\hbadness 9000
\emergencystretch 3em
%
\normalfont
%
\upshape\testtext
\scshape\testtext
\itshape\testtext
\slshape\testtext
%
\normalfont\bfseries
%
\upshape\testtext
\scshape\testtext
\itshape\testtext
\slshape\testtext
\endgroup
}
\renewcommand*{\thesubsection}{\arabic{subsection}}
\begin{document}
\tableofcontents
\fulltest{normalsize}
\fulltest{small}
\fulltest{footnotesize}
\fulltest{scriptsize}
\fulltest{large}
\fulltest{Large}
\end{document}
Compile the above code and look at the output: the em-dashes are all there. Now we’ll make our patch.
Save the following shell script in the same directory:
#! /bin/bash
# Constants
declare -r kern_between_dashes='0.075'
# Modify the above value .......^^^^^
# according to your taste: it represents the amount of space inserted
# between the two en-dashes that take the place of what would have been
# the "original" em-dash, expressed in units equal to the design size of
# the font; for example, for a font with a design size of 10 points, the
# value "0.075" corresponds to a space of 0.75pt between the dashes.
declare -r new_vp_prefix='noemdlig-'
declare -r vfont_banner_1='Font '
declare -r vfont_banner_2=' with modified em-dash ligature'
declare -r flag_change_comment='MODIFIED LIGATURE FOLLOWS'
declare -ri err_none=0
declare -ri err_bad_line_in_vpl=-1
declare -ri err_vpl_not_created=-2
declare -r errMess_no_error='no error'
declare -r errMess_bad_line_in_vpl='ligature program is not as expected'
declare -r errMess_vpl_not_created='could not create VPL file'
declare -r errMess_generic_error='unknown error'
declare -r warnMess_no_em_lig="this font hasn't got an em-dash ligature"
# Functions
function PrintMessage () {
{ echo "`basename $0`: $1"; } >&2
}
function PrintWarningMessage () {
PrintMessage "warning in font $1: $2."
}
function PrintErrorMessage () {
PrintMessage "*ERROR* in font $1: $2."
}
function Error () {
case $2 in
($err_none)
PrintErrorMessage "$1" "$errMess_no_error"
;;
($err_bad_line_in_vpl)
PrintErrorMessage "$1" "$errMess_bad_line_in_vpl"
;;
($err_vpl_not_created)
PrintErrorMessage "$1" "$errMess_vpl_not_created"
;;
(*)
PrintErrorMessage "$1" "$errMess_generic_error"
;;
esac
exit $2
}
function VFtoVP () {
vftovp "$1.vf" "$1.tfm" "$1.vpl"
}
function VPtoVF () {
vptovf "${new_vp_prefix}$1.vpl" "$1.vf" "$1.tfm"
}
function PatchVPLFile () {
local LINE
local -i bad_line=0
local -i lig_seen=0
{
while IFS= read -r LINE ; do
if [ "${LINE:0:7}" == '(VTITLE' ] ; then # ) paren match
echo "(VTITLE $vfont_banner_1$1$vfont_banner_2)"
elif [ "${LINE:0:9}" == '(CHECKSUM' ] ; then # ) paren match
# gobble line
:
# NB: ligature-table entries are indented by three spaces in the VPL file,
# which is what the substring lengths 15, 18 and 9 below account for.
elif [ "${LINE:0:15}" == '   (LABEL O 25)' ] ; then
# check next two lines
if IFS= read -r LINE && [ "${LINE:0:18}" == '   (LIG O 55 O 26)' ] ; then
if IFS= read -r LINE && [ "${LINE:0:9}" == '   (STOP)' ] ; then
lig_seen=1
echo "   (COMMENT $flag_change_comment)"
echo '   (LABEL O 25)'
echo '   (/LIG O 55 O 25)'
echo "   (KRN O 25 R $kern_between_dashes)"
echo '   (STOP)'
else
bad_line=1
fi
else
bad_line=1
fi
else
# hand line on, unchanged
echo "$LINE"
fi
done
} <"$1.vpl" >"${new_vp_prefix}$1.vpl"
[ $bad_line -eq 1 ] && return $err_bad_line_in_vpl
[ $lig_seen -ne 1 ] && PrintWarningMessage "$1" "$warnMess_no_em_lig"
return $err_none
}
function Process1Font () {
local -i last_err
# ensure removal of local copies in case called twice
rm -f "./$1.vf" "./$1.tfm"
VFtoVP "$1" && PatchVPLFile "$1" && VPtoVF "$1"
last_err=$?
if [ $last_err -ne $err_none ] ; then
Error "$1" $last_err
else
return $err_none
fi
}
# Main
Process1Font ptmr8t
Process1Font ptmrc8t
Process1Font ptmri8t
Process1Font ptmro8t
Process1Font ptmb8t
Process1Font ptmbc8t
Process1Font ptmbi8t
Process1Font ptmbo8t
Run the script, then re-compile the above LaTeX source and look again at the result.
Note that the ptm fonts are virtual fonts, so we needed to deal with VF (Virtual Font) files as well as with TFM files, and to replace PL files with VPL (Virtual Property List) files.
An unreasonably sized example
If a different TFM file is needed for each different size, the number of files that need to be patched can easily rise to hundreds. Here we use the EC version of Computer Modern in order to avoid the problem of preloaded fonts (see above).
Create a new directory for this example and save the following source into it:
\documentclass[a4paper]{article}
\usepackage[T1]{fontenc}
\newcommand*\myFontID{}
\DeclareRobustCommand*\command[1]{%
\texttt{\char\escapechar #1}%
}
\newcommand*\testtext{%
\par
We are now using the font\edef\myFontID{\the\font}
\texttt{\fontname\myFontID}
(external name, optionally followed by
\texttt{at} size at which it is loaded),
identified internally by the control sequence
\texttt{\expandafter\string\myFontID}.
This is an em-dash formed by a ligature:%
~---\spacefactor\sfcode`. \space
This, on the other hand, is an em-dash produced by
\command{textemdash}:~\textemdash
\par
}
\newcommand*\fulltest[1]{%
\subsection{Test for \command{#1}}%
\begingroup
\csname #1\endcsname
\parskip \smallskipamount
\tolerance 9000
\hbadness 9000
%
\normalfont
%
\upshape\testtext
\scshape\testtext
\itshape\testtext
\slshape\testtext
%
\normalfont\bfseries
%
\upshape\testtext
\scshape\testtext
\itshape\testtext
\slshape\testtext
%
\usefont{T1}{cmr}{b}{n} \testtext
\usefont{T1}{cmr}{m}{ui}\testtext
\endgroup
}
\renewcommand*{\thesubsection}{\arabic{subsection}}
\begin{document}
\tableofcontents
\fulltest{normalsize}
\fulltest{small}
\fulltest{footnotesize}
\fulltest{scriptsize}
\fulltest{large}
\fulltest{Large}
\end{document}
Compile it and look at the output. Then save the following shell script in the same directory:
#! /bin/bash
# Constants
declare -r kern_between_dashes='0.075'
# Modify the above value .......^^^^^ according to your taste.
declare -r new_pl_prefix='noemdlig-'
declare -r flag_change_comment='MODIFIED LIGATURE FOLLOWS'
declare -ri err_none=0
declare -ri err_bad_line_in_pl=-1
declare -ri err_pl_not_created=-2
declare -r errMess_no_error='no error'
declare -r errMess_bad_line_in_pl='ligature program is not as expected'
declare -r errMess_pl_not_created='could not create PL file'
declare -r errMess_generic_error='unknown error'
declare -r warnMess_no_em_lig="this font hasn't got an em-dash ligature"
# Functions
function PrintMessage () {
{ echo "`basename $0`: $1"; } >&2
}
function PrintWarningMessage () {
PrintMessage "warning in font $1: $2."
}
function PrintErrorMessage () {
PrintMessage "*ERROR* in font $1: $2."
}
function Error () {
case $2 in
($err_none)
PrintErrorMessage "$1" "$errMess_no_error"
;;
($err_bad_line_in_pl)
PrintErrorMessage "$1" "$errMess_bad_line_in_pl"
;;
($err_pl_not_created)
PrintErrorMessage "$1" "$errMess_pl_not_created"
;;
(*)
PrintErrorMessage "$1" "$errMess_generic_error"
;;
esac
exit $2
}
function TFtoPL () {
tftopl "$1.tfm" "$1.pl" 2> /dev/null
}
function PLtoTF () {
pltotf "${new_pl_prefix}$1.pl" "$1.tfm"
}
function PatchPLFile () {
local LINE
local -i bad_line=0
local -i lig_seen=0
{
while IFS= read -r LINE ; do
if [ "${LINE:0:9}" == '(CHECKSUM' ] ; then # ) paren match
# gobble line
:
# NB: ligature-table entries are indented by three spaces in the PL file,
# which is what the substring lengths 15, 18 and 9 below account for.
elif [ "${LINE:0:15}" == '   (LABEL O 25)' ] ; then
# check next two lines
if IFS= read -r LINE && [ "${LINE:0:18}" == '   (LIG O 55 O 26)' ] ; then
if IFS= read -r LINE && [ "${LINE:0:9}" == '   (STOP)' ] ; then
lig_seen=1
echo "   (COMMENT $flag_change_comment)"
echo '   (LABEL O 25)'
echo '   (/LIG O 55 O 25)'
echo "   (KRN O 25 R $kern_between_dashes)"
echo '   (STOP)'
else
bad_line=1
fi
else
bad_line=1
fi
else
# hand line on, unchanged
echo "$LINE"
fi
done
} <"$1.pl" >"${new_pl_prefix}$1.pl"
[ $bad_line -eq 1 ] && return $err_bad_line_in_pl
[ $lig_seen -ne 1 ] && PrintWarningMessage "$1" "$warnMess_no_em_lig"
return $err_none
}
function Process1FontSize () {
local -i last_err
# ensure removal of local copies in case called twice
rm -f "./$1.pl" "./$1.tfm"
TFtoPL "$1" && PatchPLFile "$1" && PLtoTF "$1"
last_err=$?
if [ $last_err -ne $err_none ] ; then
Error "$1" $last_err
else
return $err_none
fi
}
function Process1Font () {
local csiz
for csiz in '0500' '0600' '0700' '0800' '0900' '1000' \
'1095' '1200' '1440' '1728' \
'2074' '2488' '2986' '3583' ; do
Process1FontSize "$1${csiz}"
done
}
# Main
for font in 'bi' 'bl' 'bx' 'cc' 'ci' 'oc' 'rb' 'rm' 'sc' 'si' \
'sl' 'so' 'ss' 'sx' 'ti' 'ui' 'xc' ; do
Process1Font "ec${font}"
done
Run the script (note that this time it will take a minute or so), re-compile, and look at the output again.
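To get a feel for why this example deserves to be called unreasonable, the font names that the script above loops over can simply be enumerated and counted. The sketch below copies the two lists from the script; the function name list_ec_fonts is invented, and not every combination necessarily exists as a TFM file on a given system.

```bash
#!/bin/bash
# Shape and size codes copied from the script above.
shapes=(bi bl bx cc ci oc rb rm sc si sl so ss sx ti ui xc)
sizes=(0500 0600 0700 0800 0900 1000 1095 1200 1440 1728 2074 2488 2986 3583)

# list_ec_fonts
# Prints one candidate EC font name per line, e.g. ecrm1000.
list_ec_fonts () {
    local s z
    for s in "${shapes[@]}"; do
        for z in "${sizes[@]}"; do
            echo "ec${s}${z}"
        done
    done
}

list_ec_fonts | wc -l    # 17 shapes x 14 sizes = 238 names
```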
Enough!
You could simply replace all instances of --- in the tex file with --{}-- or, if you prefer, \textendash\textendash{}.
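A minimal sketch of such a replacement, done outside the editor, is a one-line sed filter. The function name is made up for this illustration, and the filter is deliberately naive: it also touches --- inside comments or verbatim material, and a run of four hyphens becomes --{}---, so review the result before using it.

```bash
#!/bin/bash
# split_emdash_input
# Reads TeX source on stdin and writes it with every --- turned into --{}--.
split_emdash_input () {
    sed 's/---/--{}--/g'
}

printf 'yes---no\n' | split_emdash_input    # prints yes--{}--no

# Possible use on a file (file names are placeholders):
#   split_emdash_input < paper.tex > paper-new.tex
```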
In the Computer Modern font family, the outputs of --{}-- and --- will be visually indistinguishable as an em-dash is exactly twice as wide as an en-dash.
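The width claim can be checked against the font metrics themselves. The following sketch extracts the CHARWD value of a character from a property list; the helper name charwd is invented, it only handles characters listed by octal code, and the slot numbers 173 and 174 are the OT1 en-dash and em-dash slots mentioned in the first answer.

```bash
#!/bin/bash
# charwd OCTAL_CODE
# Reads a PL file on stdin and prints the CHARWD value (in design-size
# units) of the character listed as "(CHARACTER O OCTAL_CODE".
charwd () {
    awk -v code="$1" '
        /^\(CHARACTER / { in_char = ($2 == "O" && $3 == code) }
        in_char && $1 == "(CHARWD" { sub(/\)$/, "", $3); print $3; exit }
    '
}

# Possible use (needs a TeX installation):
#   tftopl cmr10.tfm cmr10.pl
#   charwd 173 < cmr10.pl    # en-dash width
#   charwd 174 < cmr10.pl    # em-dash width
```

If the second value is twice the first, --{}-- and --- occupy the same horizontal space in that font.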
If using LuaLaTeX is an option for you, it's straightforward to set up a small function that replaces all instances of --- with \textendash\textendash{} "on the fly", before TeX starts its usual processing:
\documentclass{article}
\usepackage{luacode}
\begin{luacode}
function em2en ( buff )
return ( string.gsub ( buff, "%-%-%-", "\\textendash\\textendash{}" ) )
end
luatexbase.add_to_callback( "process_input_buffer", em2en, "em2en" )
\end{luacode}
\begin{document}
--- vs.\ \textendash\textendash
\end{document}
Addendum: If the objective includes having a tiny bit of white-space between the pair of en-dashes, it suffices to modify the MWE as follows. (The amount of spacing used here is \kern0.75pt; feel free to change the width as you see fit.)
\documentclass{article}
\usepackage{luacode}
\begin{luacode}
function em2en ( buff )
return ( string.gsub ( buff, "%-%-%-", "\\textendash\\kern0.75pt\\textendash{}" ) )
end
luatexbase.add_to_callback( "process_input_buffer", em2en, "em2en" )
\end{luacode}
\begin{document}
---and\textendash\kern0.75pt\textendash
\end{document}
Mixing a little bit of other answers. I would use utf8 characters, and I would use them correctly: a hyphen where a hyphen belongs, an en-dash where an en-dash belongs, and an em-dash where an em-dash belongs. And then change the output of those symbols.
Personally, I would load
\usepackage[utf8]{inputenc}
and then make a search & replace of --- into — (em-dash) and -- into – (en-dash). Then I would redefine
\let\textemdash\relax
\DeclareRobustCommand\textemdash{\textendash\textendash}
And use each one in its proper place, making the output different.
\documentclass{scrartcl}
\usepackage[utf8]{inputenc}
\protected\def\textemdash{\textendash\textendash}
\begin{document}
This is an em-dash — and this is two en-dashes ––.
\end{document}
I'm looking forward to the answer of Gustavo Mezzeti.