How best to embed various cell groups into a $\LaTeX$ project?

Introduction

Below I present the usage of my CellsToTeX package.

It provides functions for converting Mathematica cells to $\TeX$ code compatible with the mmacells $\TeX$ package. Compiling this $\TeX$ code produces output resembling the FrontEnd appearance of the converted cells. The full capabilities of the $\TeX$ package are described in my post in the "Fanciest way to include Mathematica code in LaTeX" thread on TeX StackExchange.

The converted code preserves formatting and carries special annotations reflecting the colorization of identifiers. The latter feature is provided by the SyntaxAnnotations package, described in an answer to the "How to convert a notebook cell to a string retaining all formatting, colorization of identifiers etc?" question.


Usage examples

Import package without installation:

Import["https://raw.githubusercontent.com/jkuczm/MathematicaCellsToTeX/master/NoInstall.m"]

Individual cells

Default conversion of "Input" cell preserves formatting:

testCell = Cell[BoxData[MakeBoxes[Subscript[x, 1] == (-b \[PlusMinus] Sqrt[b^2 - 4 a c])/(2 a)]], "Input"];
testCell // CellPrint
CellToTeX[testCell]
\begin{mmaCell}{Input}
  \mmaSub{x}{1}==\mmaFrac{-b\(\pmb{\pm}\)\mmaSqrt{\mmaSup{b}{2}-4 a c}}{2 a}
\end{mmaCell}

The same cell converted to a "Code" $\TeX$ cell. By default this conversion changes boxes to InputForm:

CellToTeX[testCell, "Style" -> "Code"]
\begin{mmaCell}{Code}
  Subscript[x, 1] == (-b \[PlusMinus] Sqrt[b^2 - 4*a*c])/(2*a)
\end{mmaCell}

Conversion of boxes with some colored symbols:

MakeBoxes[Table[Sin[x], {x, 10}]; Module[{x = 1, a}, a[y_] := x + y]] // DisplayForm
CellToTeX[%, "Style" -> "Code"]
\begin{mmaCell}[morefunctionlocal={x},morelocal={a},morepattern={y_, y}]{Code}
  Table[Sin[x], {x, 10}]; Module[{\mmaLoc{x} = 1, a}, a[y_] := \mmaLoc{x} + y]
\end{mmaCell}

Note that the commonest syntax role of each symbol is set through the environment's options; only the less common roles require annotations in the code.

Whole notebook

Let's start by creating an example notebook with some evaluated cells:

nbObj = CreateDocument[{
    Cell[BoxData@MakeBoxes[Solve[a x^2 + b x + c == 0, x]], "Input"],
    Cell[
        BoxData[{
            MakeBoxes[Module[{x = 3}, x + 2]],
            MakeBoxes[f[x_] := 2 x + 1],
            RowBox[{"Print", "[", 
            RowBox[{"\"Print a string with a fraction \"", ",", RowBox[{"a", "/", "b"}], ",", "\" inside\""}], "]"}],
            RowBox[{"1", "/", "0"}],
            RowBox[{RowBox[{"1", "+", RowBox[{"2", " ", "x"}]}], "//", "FullForm"}]
        }]
        ,
        "Input"
    ]
}];
(* Switch off auto deleting of labels, so that we can extract some data from them. *)
CurrentValue[nbObj, CellLabelAutoDelete] = False;
SelectionMove[nbObj, All, Notebook];
SelectionEvaluate[nbObj, Before];

Package default settings

Export the above notebook to $\TeX$ using default settings:

SetOptions[CellToTeX, "CurrentCellIndex" -> Automatic];
ExportString[
    NotebookGet[nbObj] /. cell : Cell[_, __] :> Cell[CellToTeX[cell], "Final"],
    "TeX",
    "FullDocument" -> False,
    "ConversionRules" -> {"Final" -> Identity}
]
\begin{mmaCell}[morefunctionlocal={x}]{Input}
  Solve[a \mmaSup{x}{2}+b x+c==0,x]
\end{mmaCell}

\begin{mmaCell}{Output}
  \{\{x\(\to\)\mmaFrac{-b-\mmaSqrt{\mmaSup{b}{2}-4 a c}}{2 a}\},\{x\(\to\)\mmaFrac{-b+\mmaSqrt{\mmaSup{b}{2}-4 a c}}{2 a}\}\}
\end{mmaCell}

\begin{mmaCell}[morelocal={x},moredefined={f},morepattern={x_}]{Input}
  Module[\{x=3\},x+2]
  f[x_]:=2 \mmaPat{x}+1
  Print["Print a string with a fraction ",a/b," inside"]
  1/0
  1+2 \mmaUnd{x}//FullForm
\end{mmaCell}

\begin{mmaCell}{Output}
  5
\end{mmaCell}

\begin{mmaCell}{Print}
  Print a string with a fraction \mmaFrac{a}{b} inside
\end{mmaCell}

\begin{mmaCell}[messagelink=message/General/infy]{Message}
  Power::infy: Infinite expression \mmaFrac{1}{0} encountered. >>
\end{mmaCell}

\begin{mmaCell}[addtoindex=2]{Output}
  ComplexInfinity
\end{mmaCell}

\begin{mmaCell}[form=FullForm]{Output}
  Plus[1,Times[2,x]]
\end{mmaCell}

The above $\TeX$ code results in the following pdf:

Print screen of pdf: default conversion

Note that the message link in the pdf is clickable.

Example of customization

Convert "Input" cells to InputForm and export other cells as pdfs.

(* We'll be creating pdf files in notebook directory. *)
SetDirectory[NotebookDirectory[]];
(* Add CellsToTeX`Configuration` to $ContextPath to get easy access to all "processors". *)
PrependTo[$ContextPath, "CellsToTeX`Configuration`"];
SetOptions[CellToTeX, "CurrentCellIndex" -> Automatic];
ExportString[
    NotebookGet[nbObj] /. {
        cell : Cell[_, "Input" | "Code", ___] :> Cell[CellToTeX[cell, "Style" -> "Code"], "Final"],
        cell : Cell[_, __] :> 
            Cell[CellToTeX[cell, "Processor" -> Composition[
                trackCellIndexProcessor, mmaCellGraphicsProcessor,
                exportProcessor, cellLabelProcessor, extractCellOptionsProcessor
            ]], "Final"]
    },
    "TeX",
    "FullDocument" -> False,
    "ConversionRules" -> {"Final" -> Identity}
]
\begin{mmaCell}[morefunctionlocal={x}]{Code}
  Solve[a*x^2 + b*x + c == 0, x]
\end{mmaCell}

\mmaCellGraphics{Output}{c6a8671c.pdf}

\begin{mmaCell}[morelocal={x},moredefined={f},morepattern={x_}]{Code}
  Module[{x = 3}, x + 2]
  f[x_] := 2*\mmaPat{x} + 1
  Print["Print a string with a fraction ", a/b, " inside"]
  1/0
  FullForm[1 + 2*\mmaUnd{x}]
\end{mmaCell}

\mmaCellGraphics{Output}{f86875a8.pdf}

\mmaCellGraphics{Print}{c2c36850.pdf}

\mmaCellGraphics{Message}{751e2ed3.pdf}

\mmaCellGraphics[addtoindex=2]{Output}{a88d6483.pdf}

\mmaCellGraphics[form=FullForm]{Output}{fdfe970a.pdf}

The above $\TeX$ code results in the following pdf:

Print screen of pdf: InputForm and included pdfs

Note that you can copy code from input cells in the pdf and paste it into Mathematica.

Mathematica built-in export

For comparison, let's export the same notebook using only Mathematica's built-in "TeX" export.

To be able to export the message cell, we first need to fix a bug:

If[FreeQ[Options[System`Convert`CommonDump`RemoveLinearSyntax], System`Convert`CommonDump`Recursive],
    DownValues[System`Convert`TeXFormDump`maketex] =
        DownValues[System`Convert`TeXFormDump`maketex] /.
            Verbatim[System`Convert`CommonDump`RemoveLinearSyntax][arg_, System`Convert`CommonDump`Recursive -> val_] :>
                System`Convert`CommonDump`RemoveLinearSyntax[arg, System`Convert`CommonDump`ConvertRecursive -> val]
];

Now we can export our example notebook:

ExportString[NotebookGet[nbObj], "TeX", "FullDocument" -> False]
\begin{doublespace}
\noindent\(\pmb{\text{Solve}\left[a x^2+b x+c==0,x\right]}\)
\end{doublespace}

\begin{doublespace}
\noindent\(\left\{\left\{x\to \frac{-b-\sqrt{b^2-4 a c}}{2 a}\right\},\left\{x\to \frac{-b+\sqrt{b^2-4 a c}}{2 a}\right\}\right\}\)
\end{doublespace}

\begin{doublespace}
\noindent\(\pmb{\text{Module}[\{x=3\},x+2]}\\
\pmb{f[\text{x$\_$}]\text{:=}2 x+1}\\
\pmb{\text{Print}[\text{{``}Print a string with a fraction {''}},a/b,\
\text{{``} inside{''}}]}\\
\pmb{1/0}\\
\pmb{1+2 x\text{//}\text{FullForm}}\)
\end{doublespace}

\begin{doublespace}
\noindent\(5\)
\end{doublespace}

\noindent\(\text{Print a string with a fraction }\frac{a}{b}\text{ inside}\)

\noindent\(\text{Power}\text{::}\text{infy}: \text{Infinite expression }\frac{1}{0}\text{ encountered. }\rangle\rangle\)

\begin{doublespace}
\noindent\(\text{ComplexInfinity}\)
\end{doublespace}

\begin{doublespace}
\noindent\(\text{Plus}[1,\text{Times}[2,x]]\)
\end{doublespace}

The above $\TeX$ code results in the following pdf:

Print screen of pdf: built-in conversion

Unicode

Let's start by listing some ways of transferring non-ASCII characters from Mathematica to the outside world.

If we just copy something to the clipboard, Mathematica will convert non-ASCII characters to the \[...] form. If we don't want that to happen, we can use one of the ways described in "How to “Copy as Unicode” from a Notebook?".

We can also Export directly to a file using an encoding appropriate for our case, e.g. CharacterEncoding -> "UTF-8".
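As a minimal sketch of the file-based route: the snippet below writes a string containing non-ASCII characters to a UTF-8 encoded text file. The variable name texString and the file name cells.tex are only placeholders for illustration; in practice texString would hold the String returned by the conversion.

```mathematica
(* texString is a placeholder for the String produced by the conversion. *)
texString = "\[Beta] + \[Gamma]";

(* "cells.tex" is a hypothetical file name; it is created in the current directory. *)
Export["cells.tex", texString, "Text", CharacterEncoding -> "UTF-8"]
```

The resulting file can then be included in a $\TeX$ document without going through the clipboard.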

The CellsToTeX package has two options useful for customizing the handling of non-ASCII characters: "StringRules" and "NonASCIIHandler".

"StringRules" accepts a list of rules used for replacing substrings with other substrings, so it can be used to directly replace a certain character with something else.

Non-ASCII characters that were not matched by "StringRules" are handled by the non-ASCII handler. The "NonASCIIHandler" option accepts a function to which a String with a non-ASCII character is passed; it should return a String with the "converted" character.
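The division of labour between the two options can be pictured with a stand-alone sketch. This is not the package's internal code, only an illustration of the idea using built-in string functions; the names stringRules, handler and convert are hypothetical.

```mathematica
(* Stage 1: explicit substring replacements, in the spirit of "StringRules". *)
stringRules = {"\[Equal]" -> "=="};

(* Stage 2: hypothetical handler, in the spirit of "NonASCIIHandler".
   It receives a one-character String and returns its replacement. *)
handler = Replace[#, {"\[Beta]" -> "\\(\\beta\\)", _ -> "?"}] &;

(* Apply the rules first, then pass every remaining non-ASCII character
   (character code above 127) to the handler. *)
convert[str_String] := StringReplace[
    StringReplace[str, stringRules],
    char_ /; First[ToCharacterCode[char]] > 127 :> handler[char]
]

convert["\[Beta]\[Equal]1"]
```

Here \[Equal] is consumed by the rules, so only β reaches the handler.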

The CellsToTeX package supports several strategies for handling Unicode.

Let's create a test notebook with two cells containing some non-ASCII characters:

testCells = {
    Cell[
        BoxData@MakeBoxes[Solve[a χ1^2 + β χ1 + γ == 0, χ1]],
        "Input"
    ]
    ,
    Cell[
        BoxData@MakeBoxes[{
            {χ1 -> (-β - Sqrt[β^2 - 4*a*γ])/(2* a)},
            {χ1 -> (-β + Sqrt[β^2 - 4*a*γ])/(2*a)}
        }],
        "Output"
    ]
};
testNb = Notebook[{Cell[CellGroupData[testCells, Open]]}];
% // NotebookPut;

Default

By default, "Code" cells use "NonASCIIHandler" -> Identity, which means that characters are left unchanged by this conversion stage. But since these cells also use "CharacterEncoding" -> "ASCII", non-ASCII characters will be converted to the \[...] form.

Other cell styles by default use the charToTeX function in the "NonASCIIHandler" option, which converts characters to the corresponding $\TeX$ commands. "Input" cells use the Bold variant, which additionally wraps commands in \pmb{...}; "Output", "Print" and "Message" cells use the Plain variant.

So the default behavior is to always give a pure ASCII result that will work in all $\TeX$ engines.

StringJoin@Riffle[CellToTeX /@ testCells, "\n\n"]
\begin{mmaCell}{Input}
  Solve[a \mmaSup{\mmaFnc{\(\pmb{\chi}\)1}}{2}+\mmaUnd{\(\pmb{\beta}\)} \mmaFnc{\(\pmb{\chi}\)1}+\mmaUnd{\(\pmb{\gamma}\)}==0,\mmaFnc{\(\pmb{\chi}\)1}]
\end{mmaCell}

\begin{mmaCell}{Output}
  \{\{\(\chi\)1\(\to\)\mmaFrac{-\(\beta\)-\mmaSqrt{\mmaSup{\(\beta\)}{2}-4 a \(\gamma\)}}{2 a}\},\{\(\chi\)1\(\to\)\mmaFrac{-\(\beta\)+\mmaSqrt{\mmaSup{\(\beta\)}{2}-4 a \(\gamma\)}}{2 a}\}\}
\end{mmaCell}

Print screen of pdf: Unicode default

Replacing Unicode at TeX level

A different strategy, which can be used with the pdfTeX engine, is to use non-ASCII characters in the $\TeX$ input and let $\TeX$ convert them to appropriate commands. At the level of the mmacells package this can be achieved using the \mmaDefineMathReplacement command. In CellsToTeX those replacements can be gathered using the texMathReplacementRegister function, and the appropriate \mmaDefineMathReplacement commands will be printed as part of the preamble by the CellsToTeXPreamble command.

Clear[texMathReplacement]
StringJoin@Riffle[
    Prepend[
        CellToTeX[#, "ProcessorOptions" -> {
            "StringRules" -> 
                Join[{"\[Equal]" -> "=="}, $stringsToTeX, $commandCharsToTeX],
            "NonASCIIHandler" ->
                (texMathReplacementRegister[Replace[#, "\[Rule]" -> "→"]] &)
        }] & /@ testCells,
        CellsToTeXPreamble[]
    ],
    "\n\n"
]
\mmaSet{morefv={gobble=2}}
\mmaDefineMathReplacement{β}{\beta}
\mmaDefineMathReplacement{γ}{\gamma}
\mmaDefineMathReplacement{χ}{\chi}
\mmaDefineMathReplacement{→}{\rightarrow}

\begin{mmaCell}{Input}
  Solve[a \mmaSup{\mmaFnc{χ1}}{2}+\mmaUnd{β} \mmaFnc{χ1}+\mmaUnd{γ}==0,\mmaFnc{χ1}]
\end{mmaCell}

\begin{mmaCell}{Output}
  \{\{χ1→\mmaFrac{-β-\mmaSqrt{\mmaSup{β}{2}-4 a γ}}{2 a}\},\{χ1→\mmaFrac{-β+\mmaSqrt{\mmaSup{β}{2}-4 a γ}}{2 a}\}\}
\end{mmaCell}

Print screen of pdf: Unicode default

Notice how we treated the two private Unicode characters \[Equal] and \[Rule] differently. \[Equal] was simply converted to == using "StringRules". \[Rule] was converted to → (\[RightArrow]) and then still passed to texMathReplacementRegister.

Since the resulting string contains non-ASCII characters, to transfer it from Mathematica we must use one of the methods described at the beginning of the "Unicode" section.

Unicode-aware TeX engines

If you're using a Unicode-aware $\TeX$ engine, e.g. xetex, you can simply use non-private Unicode characters from Mathematica in $\TeX$ input and output. But since automatic coloring of non-annotated identifiers in the mmacells package relies on the listings package, which doesn't work well with Unicode, this feature must be switched off and all identifiers should be annotated. At the level of the CellsToTeX package this can be achieved by switching off the moving of the commonest annotation types to the $\TeX$ environment's options ("CommonestTypesAsTeXOptions" -> False).

Clear[texMathReplacement]
StringJoin@Riffle[
    Prepend[
        CellToTeX[#, "ProcessorOptions" -> {
            "CommonestTypesAsTeXOptions" -> False,
            "StringBoxToTypes" -> {Automatic},
            "StringRules" -> 
                Join[
                    {"\[Equal]" -> "==", "\[Rule]" -> "→"},
                    $stringsToTeX, $commandCharsToTeX
                ],
            "NonASCIIHandler" -> Identity
        }] & /@ testCells,
        CellsToTeXPreamble["UseListings" -> False]
    ],
    "\n\n"
]
\mmaSet{uselistings=false,morefv={gobble=2}}

\begin{mmaCell}{Input}
  Solve[\mmaUnd{a} \mmaSup{\mmaFnc{χ1}}{2}+\mmaUnd{β} \mmaFnc{χ1}+\mmaUnd{γ}==0,\mmaFnc{χ1}]
\end{mmaCell}

\begin{mmaCell}{Output}
  \{\{χ1→\mmaFrac{-β-\mmaSqrt{\mmaSup{β}{2}-4 a γ}}{2 a}\},\{χ1→\mmaFrac{-β+\mmaSqrt{\mmaSup{β}{2}-4 a γ}}{2 a}\}\}
\end{mmaCell}

Print screen of pdf: Unicode xetex

Since the resulting string contains non-ASCII characters, to transfer it from Mathematica we must use one of the methods described at the beginning of the "Unicode" section.


Package design overview

In addition to the main context, the package also provides the CellsToTeX`Configuration` context, with variables and functions useful for package customization. All CellsToTeX`Configuration`* symbols are considered part of the package's public interface.

The package's main context provides the CellToTeX function, which accepts whole Cell expressions or arbitrary boxes, reads options and passes all that data to a Processor function, which does the real work. A Processor is a function that accepts and returns a list of options. Since the input of a processor function has the same form as its output, processor functions can be easily chained.
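The chaining idea can be sketched with two toy processors. These are hypothetical functions written only for illustration, not processors shipped with CellsToTeX; the option names "Data" and "Length" are likewise made up.

```mathematica
(* Hypothetical processors, for illustration only; not part of CellsToTeX. *)
(* Each one accepts a list of options and returns a list of options. *)
trimData[opts_List] :=
    opts /. ("Data" -> s_String) :> ("Data" -> StringTrim[s]);
addLength[opts_List] :=
    Append[opts, "Length" -> StringLength["Data" /. opts]];

(* Composition applies the rightmost processor first. *)
processor = Composition[addLength, trimData];

processor[{"Data" -> "  x + 1  "}]
(* {"Data" -> "x + 1", "Length" -> 5} *)
```

Because every stage has the same input and output shape, stages can be added, removed or reordered inside Composition without changing the others.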

A processor function can be passed to CellToTeX in the "Processor" option. If this option is not given, CellToTeX extracts the default processor from the "CellStyleOptions" option. This extraction is based on the cell style, given explicitly as the "Style" option or extracted from the given Cell expression.

Currently the package provides 11 processor functions, from which the default processors for different cell styles are composed.

Some processor functions accept options. A list of options for processors can be given to CellToTeX as the value of the "ProcessorOptions" option. Default values of processor options for different cell styles are extracted from the "CellStyleOptions" option.


tl;dr: If you have a notebook you want to convert, put the following at the top of your notebook:

Import["https://raw.githubusercontent.com/jkuczm/\
MathematicaSyntaxAnnotations/master/SyntaxAnnotations/\
SyntaxAnnotations.m"]

Import["https://raw.githubusercontent.com/jkuczm/\
MathematicaCellsToTeX/master/NoInstall.m"]

Then put this at the bottom of your notebook:

SetOptions[CellToTeX, "CurrentCellIndex" -> Automatic];
ExportString[
 NotebookGet[] /. 
  cell : Cell[_, __] :> Cell[CellToTeX[cell], "Final"], "TeX", 
 "FullDocument" -> False, "ConversionRules" -> {"Final" -> Identity}]

It will output the code as a text output IN your notebook. Copy-paste the code into your LaTeX document. Add \usepackage{mmacells} to the preamble of your LaTeX document, and put the mmacells style file in your LaTeX file's folder.