pdfTeX hang prevention
The issue can be reproduced with a smaller example, showing it has nothing to do with the loaded packages:
\documentclass{article}
\begin{document}
$\left. \begin{array} { l } { a ) A = \{ 2 $
\section{Solution}
${a: 0}$
\end{document}
The errors in the math formula, followed by the section title, make TeX enter an infinite loop announced by
! LaTeX Error: \begin{array} on input line 7 ended by \end{document}.
See the LaTeX manual or LaTeX Companion for explanation.
Type H <return> for immediate help.
...
l.12 \end{document}
?
! Improper \prevdepth.
\newpage ...everypar {}\fi \par \ifdim \prevdepth
>\z@ \vskip -\ifdim \prevd...
l.12 \end{document}
The missing \end{array} causes \par to still be defined as “do nothing”, as it always is inside array. Since the error recovery here is to try doing \end{document}, LaTeX tries to finish up the page by issuing \par, which does nothing.
If we add \tracingmacros=1, then after the last error message we see, in the log file after interrupting the program, a string of
\par ->
\par ->
\par ->
\par ->
\par ->
Solution: don't make silly errors in your input.
Another solution could be running pdflatex with the option -halt-on-error, which would stop it at the first error.
However, this is not foolproof. If the user has \def\foo{\foo} in their preamble, then the first usage of \foo in the document would start an infinite loop with no error.
This is an answer to the actual problem of running TeX as a subprocess on a document that contains user supplied code that you have no control over (rather than focusing on the particular example you've provided).
As already mentioned by others, it's trivially easy to trigger an infinite loop in TeX without generating any errors. Your example shows a plausible user mistake (forgetting the end of an environment) but you also need to guard against a malicious user deliberately triggering an infinite loop.
Whenever you have an application or script that spawns a subprocess that has the potential to run indefinitely, it's a good idea to include a timeout. Since you're using Python, you might find the answers to “Using module 'subprocess' with timeout” useful.
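As a minimal sketch of the timeout idea, assuming Python 3 and the standard subprocess module: the helper name run_with_timeout, the 30-second limit and the file name document.tex in the comment are only illustrative choices, not anything taken from your setup.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s, cwd=None):
    """Run cmd and return (finished, returncode).

    finished is False (and returncode is None) if the time limit was hit.
    """
    try:
        completed = subprocess.run(
            cmd,
            cwd=cwd,
            stdin=subprocess.DEVNULL,   # TeX must never sit waiting at its "?" prompt
            stdout=subprocess.PIPE,     # collect the log output instead of printing it
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return False, None
    return True, completed.returncode


# Hypothetical call for a TeX job (assumes pdflatex is on the PATH):
#   run_with_timeout(["pdflatex", "-interaction=nonstopmode",
#                     "-halt-on-error", "document.tex"], timeout_s=30)
```

Note that this only kills the process that was started directly. If that process has spawned children of its own, you would need something stronger, such as a process group or a container.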
There are, however, other types of malicious code that you need to consider. Significant security improvements were made in both TeX Live and MiKTeX in 2010, and there have been more recent fixes as well, such as:
- Buffer overflow in texlive-bin allowed arbitrary code execution when a malicious Type 1 font is loaded.
- Incorrect handling of certain files in TeX Live on Ubuntu 14.04 LTS.
So make sure you have an up-to-date TeX distribution.
The security settings for TeX Live are in the texmf.cnf configuration file. There are two of these files by default and their locations can be found with kpsewhich -a texmf.cnf. One contains the default settings that shouldn't be modified. The other can be used to override specific settings if required.
The security settings for MiKTeX are in the miktex.ini file.
The main source for concern is the shell escape (\write18). There are three modes:
- Disabled (shell_escape=f in the texmf.cnf file or use -no-shell-escape when running TeX). This will prevent any system commands from being called by TeX. This is the most secure mode.
- Restricted (shell_escape=p in the texmf.cnf file). This imposes the following restrictions on \write18:
  - Certain characters are forbidden (such as ' and ;) to prevent injection.
  - Only applications on the trusted list can be run. These are identified in the shell_escape_commands setting in the texmf.cnf file. You can list them with kpsewhich -var-value=shell_escape_commands. There are currently eight: bibtex, bibtex8, extractbb, gregorio, kpsewhich, makeindex, repstopdf, texosquery-jre8. These have been evaluated by the TeX Live security team and determined to be safe. (It is, however, possible to still misuse this setting with destructive effect, as I recently demonstrated at the UK TUG meeting.)
- Unrestricted (shell_escape=t in the texmf.cnf file or use -shell-escape when running TeX). This allows any system command to be called and is therefore insecure.
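If you want your script to check the trusted list itself, the kpsewhich call mentioned above can be wrapped along these lines. This is only a sketch: the function names are made up for illustration, and it assumes that kpsewhich prints the value as a single comma-separated string, possibly with a trailing comma.

```python
import shutil
import subprocess


def parse_command_list(raw):
    """Split a comma-separated value into a clean list of command names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def query_shell_escape_commands():
    """Return the restricted shell escape list, or None if it can't be queried."""
    if shutil.which("kpsewhich") is None:
        return None  # no TeX distribution on the PATH
    result = subprocess.run(
        ["kpsewhich", "-var-value=shell_escape_commands"],
        capture_output=True,
        text=True,
        timeout=10,
    )
    if result.returncode != 0:
        return None
    return parse_command_list(result.stdout)
```

One possible use is to compare the returned list against the entries you expect and refuse to run user documents if something unfamiliar has been added.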
Another area of concern is file I/O. File operations are essential to common document build requirements (such as generating a table of contents, cross-references and indexes) but can be misused. In addition to the operating system's native file permissions, TeX also has settings to determine whether read or write access is allowed.
The texmf.cnf file has two settings, openin_any and openout_any, that may take one of the following values:
- a: any file allowed (if permitted by the operating system);
- r: (restricted) hidden dot files not allowed;
- p: (paranoid) hidden dot files not allowed, going to parent directories (..) disallowed, and absolute paths restricted to be under $TEXMFOUTPUT.
The default values are:
openin_any = a
openout_any = p
The paranoid setting prevents files from being accessed outside of the current working path (the directory that TeX was called from).
For example, suppose you are running TeX on a web server and suppose your home directory on that server is /home/foo and the root for your website is /home/foo/public_html (so, for example, if your website is www.example.com then www.example.com/index.php corresponds to the file /home/foo/public_html/index.php).
If you run TeX from your home directory (/home/foo) then, even with the paranoid setting, malicious code added to your document can overwrite public_html/index.php (if it's not protected by the filing system). Your website's home page is now corrupted.
With the file read operation, if the user gets to see the generated PDF, they can use malicious TeX code to access information from your system. Suppose you have a script /home/foo/public_html/foobar.php that accesses a database. This could be input verbatim into the document and the database connection information, including the password, can now be read from the PDF.
TeX code can be obfuscated so don't rely on using regular expressions to check for certain commands within the user-supplied code.
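To illustrate why a regular expression check is not enough, here is a small sketch. The blocklist pattern is deliberately naive and invented for this example. The two obfuscated inputs do the same thing as a plain \input, building the command name with \csname in one case and making ! an escape character in the other.

```python
import re

# A deliberately naive blocklist (hypothetical, for illustration only).
NAIVE_BLOCKLIST = re.compile(r"\\(input|include|write18|openin|openout)\b")


def naive_filter_flags(tex_source):
    """Return True if the naive blocklist finds a forbidden command."""
    return NAIVE_BLOCKLIST.search(tex_source) is not None


plain = r"\input{/home/foo/public_html/foobar.php}"

# \csname ... \endcsname assembles the control sequence \input at run time.
via_csname = r"\csname input\endcsname{/home/foo/public_html/foobar.php}"

# Giving "!" category code 0 makes it an escape character, so !input is \input.
via_catcode = r"\catcode`\!=0 !input{/home/foo/public_html/foobar.php}"

for label, source in [("plain", plain), ("csname", via_csname), ("catcode", via_catcode)]:
    print(label, "flagged" if naive_filter_flags(source) else "NOT flagged")
```

Only the plain version is caught. You could extend the pattern to cover these two cases, but there are many more ways to build a command name in TeX, which is why the restrictions need to be enforced by TeX and the operating system.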
Summary:
- Ensure you have an up-to-date TeX installation.
- Invoke TeX with a timeout that will automatically kill the process if it goes on too long.
- Run TeX with -no-shell-escape.
- Run TeX in a safe directory that doesn't have any subdirectories leading to important files.
- Ensure that both openout_any and openin_any are set to p.
- If you need to view the generated PDF, make sure that your PDF viewer has JavaScript disabled.
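Several of the points above can be prepared from Python before TeX is started. The following is a sketch only: prepare_sandboxed_run and the job name are hypothetical, and it assumes a TeX Live installation where Kpathsea lets environment variables override texmf.cnf settings. Check that assumption against your own distribution, as MiKTeX is configured differently.

```python
import os
import tempfile
from pathlib import Path


def prepare_sandboxed_run(tex_source, jobname="document"):
    """Write tex_source into a fresh, empty directory.

    Returns (workdir, cmd, env) ready to be passed to subprocess.run.
    """
    # A new temporary directory has no subdirectories leading to important files.
    workdir = Path(tempfile.mkdtemp(prefix="texjob-"))
    (workdir / f"{jobname}.tex").write_text(tex_source, encoding="utf-8")

    cmd = [
        "pdflatex",
        "-no-shell-escape",          # disable \write18 completely
        "-halt-on-error",            # stop at the first error
        "-interaction=nonstopmode",  # never prompt for input
        f"{jobname}.tex",
    ]

    env = dict(os.environ)
    # Assumption: these environment variables override the texmf.cnf values.
    env["openin_any"] = "p"
    env["openout_any"] = "p"
    return workdir, cmd, env


# Hypothetical use, combined with a timeout:
#   workdir, cmd, env = prepare_sandboxed_run(user_supplied_source)
#   subprocess.run(cmd, cwd=workdir, env=env, timeout=30)
```

The temporary directory should be deleted once the PDF has been collected, and the process should ideally run as a user with no access to anything else of value.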
You have unbalanced environments/braces: \begin{array} has no \end{array}, and \left. has no matching \right. delimiter. Also, load breqn after amsmath, and add lmodern to prevent font size substitutions.
\documentclass{article}
\usepackage{graphicx,lmodern}
\usepackage{draftwatermark}
\usepackage{amsmath}
\usepackage{breqn}
\SetWatermarkText{FAST MATH}
\SetWatermarkScale{2}
\SetWatermarkVerCenter{0.6\paperheight}
\SetWatermarkAngle{30}
\begin{document}
\section{Input}
$ \begin{array} { l } a) A = \{ 2 \end{array}$
\section{Solution}
${a: 0}$
\end{document}