Chemfig submolecule that takes an argument

I once put a similar question to Christian Tellechea, the author of the chemfig package, and he kindly sent me some custom code that solves it. That code is included in the example below. I hope you don't mind that I use example molecules of my own rather than yours.

\documentclass{article}

\usepackage{chemfig,xstring}

\makeatletter

% he sent me this later ... it seems this is simply a slight refactoring.

\newcommand*\if@csfirst[1]{%
    \csname @\ifcat\relax\expandafter\noexpand\@car#1\@nil first\else second\fi oftwo\endcsname
}

\newcommand*\derivesubmol[4]{%
    \saveexpandmode\saveexploremode\expandarg\exploregroups
    \if@csfirst{#2}
        {\expandafter\StrSubstitute\@car#2\@nil}
        {\expandafter\StrSubstitute\csname CF@@#2\endcsname}
    {\@empty#3}{\@empty#4}[\temp@]%
    \if@csfirst{#1}
        {\expandafter\let\@car#1\@nil}
        {\expandafter\let\csname CF@@#1\endcsname}\temp@
    \restoreexpandmode\restoreexploremode
}

\newcommand*\showsubmol[1]{%
    \if@csfirst{#1}%
        {\begingroup submol "\expandafter\showsubmol@i\string#1" = \ttfamily
        \expandafter\expandafter\expandafter\def\expandafter\expandafter\expandafter#1%
            \expandafter\expandafter\expandafter{\expandafter\@gobble#1}%
        \expandafter\expandafter\expandafter\strip@prefix\expandafter\meaning\@car#1\@nil
        \endgroup}%
        {\expandafter\showsubmol\csname CF@@#1\endcsname}%
}

\def\showsubmol@i#1#2#3#4#5{}

\newcommand*\exp@addtomacro[2]{\expandafter\@xs@addtomacro\expandafter#1\expandafter{#2}}

\newcommand*\expandsubmol[1]{%
    \if@csfirst{#1}%
        {\saveexpandmode\saveexploremode\expandarg\noexploregroups
        \let\parsed@mol\@empty\let\remain@mol#1%
        \IfSubStr#1!%
            {\expandsubmol@i
            \let#1\parsed@mol
            }%
            \relax
        }%
        {\expandafter\expandsubmol\csname CF@@#1\endcsname}%
}

\newcommand*\expandsubmol@i{%
    \StrBefore\remain@mol![\temp@]%
    \exp@addtomacro\parsed@mol\temp@
    \StrBehind\remain@mol![\remain@mol]%
    \StrSplit\remain@mol\@ne\remain@mol\temp@
    \StrRemoveBraces\remain@mol[\remain@mol]%
    \expandafter\if@csfirst\expandafter{\remain@mol}%
        {\expandafter\let\expandafter\remain@mol\remain@mol}%
        {\expandafter\let\expandafter\remain@mol\csname CF@@\remain@mol\endcsname}%
    \StrGobbleLeft\remain@mol\@ne[\remain@mol]%
    \exp@addtomacro\remain@mol\temp@
    \IfSubStr\remain@mol!%
        \expandsubmol@i
        {\exp@addtomacro\parsed@mol\remain@mol
        \restoreexpandmode\restoreexploremode
        }%
}

\makeatother

% cosmetic enhancements for the example 
\setcrambond{1.75pt}{0.4pt}{1.0pt} % previously too crammed for print.
\setatomsep{18pt}
\renewcommand*\printatom[1]{\ensuremath{\mathsf{#1}}}

\tikzset{ % bond in the foreground
    fgbond/.style={% foreground bond - connecting two cram bonds.
        line width=1.6pt,
        shorten <=-.6pt,
        shorten >=-.6pt
   }
}

% define named substituent dummies. Not strictly necessary, but useful for 
% visual display of the template that contains them. 
\definesubmol{rt1}{-[:90]rt1}
\definesubmol{rt2}{-[:-90,0.6]rt2}


% define a template that contains named dummy substituents that can be replaced
\definesubmol{ribosetemplate}{%
        (
            -[:28,1.508]O
            -[:-28,1.508]
        )
    <[:-45]
        (
          -[0,1.25,,,fgbond]
              (!{rt2})
          >[:45]
              (!{rt1})
         )
}


% partially specialize the template by supplying a substituent for the ribose
\derivesubmol{ribonucleoside}{ribosetemplate}{!{rt2}}{-[6,0.9]OH}
\derivesubmol{deoxyribonucleoside}{ribosetemplate}{!{rt2}}{}


% define some more submols
\definesubmol{adenine}{N*5([::-18]-*6(-N=-N=(-NH_2)-)=-N=-)}
\definesubmol{guanine}{N*5([::-18]-*6(-N=(-NH_2)-NH-(=O)-)=-N=-)}

% fill the remaining placeholder (rt1) with a base
\derivesubmol{guanosine}{ribonucleoside}{!{rt1}}{-[2]!{guanine}}
\derivesubmol{deoxyadenosine}{deoxyribonucleoside}{!{rt1}}{-[2]!{adenine}}

\begin{document}

% display the template, so that we know what is where
\chemfig{entry-[6]!{ribosetemplate}-[6]exit}
\hspace{1in}
% put it all together
\chemfig{X-[6,2]!{guanosine}-[6,2]{stuff}-[6,2]!{deoxyadenosine}-[6,2]B}

\end{document}

This produces the following output:

[Image: the ribose template with its rt1 and rt2 placeholders, next to the assembled chain]

The code between \makeatletter and \makeatother was written by Christian and is about as clear to me as cuneiform script, so please don't ask me how it works. It depends on the xstring package, also written by Christian. It is easy enough to use, though: you define a template molecule with string placeholders, which the \derivesubmol macro then replaces with different substituents. I think this macro would be a useful addition to the (already wonderful) chemfig package.
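To show the usage pattern on something smaller than the ribose example, here is a minimal sketch. It assumes that the \derivesubmol definition from the listing above is already in the preamble and that it works with your chemfig version (it relies on chemfig internals). The submolecule names R, acyl, aceticacid and acetamide are made up for this illustration.

```latex
% Assumption: \derivesubmol (from the listing above) is defined in the preamble.

% Placeholder submolecule; it only serves to make the template displayable.
\definesubmol{R}{-R}

% Template with one replaceable position, marked by !{R}.
\definesubmol{acyl}{H_3C-C(=[2]O)!{R}}

% Derive two concrete submolecules by substituting the placeholder string.
\derivesubmol{aceticacid}{acyl}{!{R}}{-OH}
\derivesubmol{acetamide}{acyl}{!{R}}{-NH_2}

% In the document body:
\chemfig{!{acyl}}\qquad
\chemfig{!{aceticacid}}\qquad
\chemfig{!{acetamide}}
```

The third argument is matched as a literal string inside the stored template, so it has to be written exactly as it appears there, braces included.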


There are two issues here. The first is how to get variable parts in a molecule; the second is how to use formulas inside a tikzpicture.

As far as I understand, there are two ways to get variable parts in a molecule.

  • Define submolecules.

    \definesubmol{X}{OCH_3}
    \definesubmol{Y}{OSO_3^{-}}
    \newcommand*\HOOC{HOOC-[1]-[2]-[1]*6(=-=(-!{X})-(-!{Y})=-)}
    \chemfig*{!\HOOC}
    

    With \redefinesubmol{X}{...} you can bind the names to new submolecules, so that the same \chemfig*{!\HOOC} produces a different molecule.

  • Define a command that wraps \chemfig and takes the variable parts as arguments.

    \newcommand*{\cfHOOC}[2]{\chemfig*{HOOC-[1]-[2]-[1]*6(=-=(-#1)-(-#2)=-)}}
    \cfHOOC{OCH_3}{OSO_3^{-}}
    

It doesn't seem to be possible to have a macro with arguments after !.
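The first option can also be sketched as a complete document. This is a minimal sketch that assumes nested submolecules are resolved only when \chemfig is called, so that redefining X changes the result. The name hooc and the replacement OH are chosen here for illustration, and plain \chemfig is used instead of the starred form.

```latex
\documentclass{article}
\usepackage{chemfig}
\begin{document}

% The two variable parts.
\definesubmol{X}{OCH_3}
\definesubmol{Y}{OSO_3^{-}}

% The skeleton refers to the variable parts by name.
\definesubmol{hooc}{HOOC-[1]-[2]-[1]*6(=-=(-!{X})-(-!{Y})=-)}

% First molecule: X = OCH_3, Y = OSO_3^-
\chemfig{!{hooc}}

% Rebind X; the skeleton itself is left unchanged.
\redefinesubmol{X}{OH}

% Second molecule: X = OH, Y = OSO_3^-
\chemfig{!{hooc}}

\end{document}
```

Note that the redefinition stays in effect for every later use of the skeleton, so the order of the \redefinesubmol calls matters.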

The second issue comes from the fact (?) that \chemfig formulas are themselves tikzpictures, and nesting tikzpictures may not work. A workaround is to typeset the formulas into boxes outside the tikzpicture and then use those boxes inside it.

\newsavebox\formulaA
\savebox\formulaA{\cfHOOC{OH}{O-[:30](-[::25](-[:190]OH)-[:-15](-[:75]OH)-[:15]?-[:-15]OH)(-[::-15]O-[::-30]?-[0]COOH)}}

\newsavebox\formulaB
\savebox\formulaB{\cfHOOC{OCH_3}{OSO_3^{-}}}

\begin{tikzpicture}
... \usebox\formulaA ... \usebox\formulaB ...
\end{tikzpicture}

Here is the resulting document, followed by the corresponding code.

[Image: the two formulas with their labels]

\documentclass[border=1mm]{standalone}
\usepackage{chemfig}
\renewcommand*\printatom[1]{\ensuremath{\mathsf{#1}}}
\setatomsep{2em}
\usetikzlibrary{positioning}
\begin{document}
\newcommand*{\cfHOOC}[2]{\chemfig*{HOOC-[1]-[2]-[1]*6(=-=(-#1)-(-#2)=-)}}

\newsavebox\formulaA
\savebox\formulaA{\cfHOOC{OH}{O-[:30](-[::25](-[:190]OH)-[:-15](-[:75]OH)-[:15]?-[:-15]OH)(-[::-15]O-[::-30]?-[0]COOH)}}

\newsavebox\formulaB
\savebox\formulaB{\cfHOOC{OCH_3}{OSO_3^{-}}}

\begin{tikzpicture}[label position={below}]
    \node[label={Dihydrocaffeic acid-3-O-glucuronide}] (DA3OG) {\usebox\formulaA};
    \node[label={Dihydroferulic acid-4-O-sulfate}, below left=-1.5cm and 3cm of DA3OG] {\usebox\formulaB};
\end{tikzpicture}
\end{document}

Edit: It turns out that these particular \chemfig formulas also work when used directly inside the tikzpicture. Just keep in mind that if you do run into errors, nested tikzpictures may be the cause.


Following what @gernot proposed (which turns out to be what the manual describes) but without the saveboxes, which might cause problems later although none has come up so far, it is possible to define the submolecule in terms of other submolecules given as macros:

\definesubmol{hooc}{HOOC-[1]-[2]-[1]*6(=-=(-!{\X})-(-!{\Y})=-)}

Then we define a command that resets the macros \X and \Y as needed:

\newcommand*{\radicals}[2]{\edef\X{#1}\edef\Y{#2}}

The reason for using macros (and \edef rather than \def) for the independent submolecules is that we can then define further submolecules as macros and pass them in too, e.g. \def\metil{CH_3}. That example is trivial, but the MWE below contains a more convincing one:

\documentclass{standalone}
\usepackage{tikz,chemfig}

\renewcommand*\printatom[1]{\ensuremath{\mathsf{#1}}} % Uses sf font
\setatomsep{2em} % Sets atom separation
\usetikzlibrary{positioning}

\newcommand*{\radicals}[2]{\edef\X{#1}\edef\Y{#2}} % Macro that renews the radicals
\definesubmol{hooc}{HOOC-[1]-[2]-[1]*6(=-=(-!{\X})-(-!{\Y})=-)} % Dependent submol
\def\glucuronide{% Complex independent submol
 O-[:30](-[::25](-[:190]OH)-[:-15](-[:75]OH)-[:15]?-[:-15]OH)(-[::-15]O-[::-30]?-[0]COOH)
}

\begin{document}
  \begin{tikzpicture}[label position={below}]
    \node[label={Dihydrocaffeic acid-3-O-glucuronide}] (DA3OG)
         {\radicals{OH}{\glucuronide}\chemfig*{!{hooc}}};
    \node[label={Dihydroferulic acid-4-O-sulfate}, below left=-1.5cm and 3cm of DA3OG]
         {\radicals{OCH_3}{OSO_3^{-}}\chemfig*{!{hooc}}};
  \end{tikzpicture}
\end{document}

This results in the desired output:

[Image: the two formulas with their labels]
