How to break long words after n chars (long genomic sequences)
The seqsplit package breaks up such expressions by adding suitable break points. It is designed for exactly this kind of DNA sequence and copes in a sophisticated way with various forms of input. However, you want to break your material after a specific number of characters instead. This can be done with the xstring package, via its splitting command \StrSplit:
\documentclass{article}
\usepackage{xstring,etoolbox}
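% \fixsplit{<n>}{<string>}: print <string>, forcing a line break after every <n> characters
\newcommand{\fixsplit}[2]{%
  \StrLen{#2}[\mynum]% store the length of the remaining string in \mynum
  \ifnumcomp{\mynum}{<}{\numexpr(#1)+1\relax}%
    {#2}% at most <n> characters left: print them and stop
    {\StrSplit{#2}{#1}{\myfirststr}{\mysecondstr}% split after <n> characters
     \myfirststr\linebreak
     \fixsplit{#1}{\mysecondstr}}}% recurse on the remainder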
\begin{document}
\begin{quote}
\ttfamily
\fixsplit{30}{CTCCTTGGGCTGTTATTCCGTAAAAGTATTTGTGGAAGATACGGCTGTCATACATGATATGTTTTTTGTTTATAACAATAGTTCTTTCTTTGATTTCACCATAGGTTGCCTCAAATTGCTCTTTTGTTGCTTGTCCAGCTGTTAAGACTAAATGTTTTGACCCCTCATTTATAAGACCGATTGCGTTGAATGGTAAGACATTCTGTTGTGCTGATTGTAATTCTGAATAGCTACGGATTTTTATGAAGATATAGTTTTTTAATATTGGTATTTCATTCCAGACATACTTCTGTATAAAGGATTTATTAAACGGTGTTGTTTTGATTGCTCTATAATACTTATCTTGTTGTCCTCTTAATTTTACCCAAGGTCTTTCAAACTCTTGGGAGTTAATGATTATAAGCATATTGTAAAGCTGTCCAGCTAATCCGAAGAATACTGGAAGCCAGTGGGTAAAGCTTGTCTGTTTTGGTAAAGCTGTTTGAACGTCTGACAAGAACAAGTCCAGACCTTCATATTTGTGGATTTTTTGAAACTTCATATTTTGATATGAACCGTCTACAATATCACTATATTTTACTGGTTGCCCAGTTTTTTGATTAATGTATCCAGGTCTTTAATATCTACTACTAAAACCACCGTAACCATAGTCCACGTTAGAGATATAGAGAGGTTTCGCATAAATGTGAACCCAGATTGCTTGTTGTTGTCTTTCATAACTCATTTGAAGACCAGTTTTAATGCGTTCTTTAATTGCTTGATACGTT}
\end{quote}
\end{document}
Note that I have chosen to print the result in a fixed-width font. In a proportional font, lines with the same number of characters have different widths, which looks rather strange. Also note that, because of the way xstring works, the results of its operations usually have to be stored in a macro rather than being used directly.
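To illustrate the point about storing results, here is a minimal sketch. The macro names \seqlen, \headpart and \tailpart are made up for this example. The optional bracketed argument of \StrLen and the last two arguments of \StrSplit name the macros that receive the results.

```latex
\documentclass{article}
\usepackage{xstring}
\begin{document}
% The optional final argument stores the length instead of printing it.
\StrLen{CTCCTTGGGC}[\seqlen]%
% Split after the 4th character; the two parts end up in the named macros.
\StrSplit{CTCCTTGGGC}{4}{\headpart}{\tailpart}%
Length: \seqlen; head: \headpart; tail: \tailpart.
\end{document}
```

This should typeset the length 10, the head CTCC and the tail TTGGGC.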
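If your editor supports search and replace on selected text, replace C by C\brk{}, G by G\brk{}, A by A\brk{} and T by T\brk{} in your long strings. If you don't want the text to run out of view in the editor, break the source into several lines and end each one with %, so that the line ends do not insert spaces, as in the second paragraph of this example: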
\documentclass{article}
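% \brk: an empty discretionary (a break point that prints nothing),
% followed by \hfil to supply stretch between the letters
\newcommand*{\brk}{\discretionary{}{}{}\hfil}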
\begin{document}
\noindent C\brk{}T\brk{}C\brk{}C\brk{}T\brk{}T\brk{}G\brk{}G\brk{}G\brk{}C\brk{}T\brk{}G\brk{}T\brk{}T\brk{}A\brk{}T\brk{}T\brk{}C\brk{}C\brk{}G\brk{}T\brk{}A\brk{}A\brk{}A\brk{}A\brk{}G\brk{}T\brk{}A\brk{}T\brk{}T\brk{}T\brk{}G\brk{}T\brk{}G\brk{}G\brk{}A\brk{}A\brk{}G\brk{}A\brk{}T\brk{}A\brk{}C\brk{}G\brk{}G\brk{}C\brk{}T\brk{}G\brk{}T\brk{}C\brk{}A\brk{}T\brk{}A\brk{}C\brk{}A\brk{}T\brk{}G\brk{}A\brk{}T\brk{}A\brk{}T\brk{}G\brk{}T\brk{}T\brk{}T\brk{}T\brk{}T\brk{}T\brk{}G\brk{}T\brk{}T\brk{}T\brk{}A\brk{}T\brk{}A\brk{}A\brk{}C\brk{}A\brk{}A\brk{}T\brk{}A\brk{}G\brk{}T\brk{}T\brk{}C\brk{}T\brk{}T\brk{}T\brk{}C\brk{}T\brk{}T\brk{}T\brk{}G\brk{}A\brk{}T\brk{}
\hfill\mbox{}
\medskip
\noindent
C\brk{}T\brk{}C\brk{}C\brk{}T\brk{}T\brk{}G\brk{}G\brk{}G\brk{}C\brk{}%
T\brk{}G\brk{}T\brk{}T\brk{}A\brk{}T\brk{}T\brk{}C\brk{}C\brk{}G\brk{}%
T\brk{}A\brk{}A\brk{}A\brk{}A\brk{}G\brk{}T\brk{}A\brk{}T\brk{}T\brk{}%
T\brk{}G\brk{}T\brk{}G\brk{}G\brk{}A\brk{}A\brk{}G\brk{}A\brk{}T\brk{}%
A\brk{}C\brk{}G\brk{}G\brk{}C\brk{}T\brk{}G\brk{}T\brk{}C\brk{}A\brk{}%
T\brk{}A\brk{}C\brk{}A\brk{}T\brk{}G\brk{}A\brk{}T\brk{}A\brk{}T\brk{}%
G\brk{}T\brk{}T\brk{}T\brk{}T\brk{}T\brk{}T\brk{}G\brk{}T\brk{}T\brk{}%
T\brk{}A\brk{}T\brk{}A\brk{}A\brk{}C\brk{}A\brk{}A\brk{}T\brk{}A\brk{}%
G\brk{}T\brk{}T\brk{}C\brk{}T\brk{}T\brk{}T\brk{}C\brk{}T\brk{}T\brk{}%
T\brk{}G\brk{}A\brk{}T\brk{}
\hfill\mbox{}
\end{document}
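If doing the replacement in an editor is inconvenient, the same break points can be inserted by TeX itself. The following is only a sketch: \autobrk is a hypothetical helper name, and it relies on the LaTeX kernel loop \@tfor, which walks through its argument one token at a time (ignoring spaces).

```latex
\documentclass{article}
\newcommand*{\brk}{\discretionary{}{}{}\hfil}
\makeatletter
% \autobrk{<sequence>} (hypothetical helper): typeset each character followed by \brk
\newcommand*{\autobrk}[1]{\@tfor\@seqchar:=#1\do{\@seqchar\brk}}
\makeatother
\begin{document}
\noindent
\autobrk{CTCCTTGGGCTGTTATTCCGTAAAAGTATTTGTGGAAGATACGGCTGTCATACATGATATGTTTTTTGTTTATAACAATAGTTCTTTCTTTGAT}%
\hfill\mbox{}
\end{document}
```

With this, the sequence can stay unmodified in the source, at the cost of breaking wherever the line happens to be full rather than at a chosen character count.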
Please try this one, which loops over the characters with \@tfor and inserts a \newline after every n of them:
\documentclass{article}
\begin{document}
\parindent=0pt
\ttfamily
\makeatletter
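% \xfoo{<n>}{<string>}: print <string> with a \newline after every <n> characters
% (each number is ended with \relax so that TeX stops scanning for further digits)
\def\xfoo#1#2{\@tempcnta=0\relax
\@tfor\xx:=#2\do{\advance\@tempcnta 1\relax
\xx\ifnum\@tempcnta=#1\relax\newline\@tempcnta=0\relax\fi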
}%
}
\xfoo{10}{CTCCTTGGGCTGTTATTCCGTAAAAGTATTTGTGGAAGATACGGCTGTCATACATGATATGTTTTTTGTTTATAACAATAGTTCTTTCTTTGATTTCACCATAGGTTGCCTCAAATTGCTCTTTTGTTGCTTGTCCAGCTGTTAAGACTAAATGTTTTGACCCCTCATTTATAAGACCGATTGCGTTGAATGGTAAGACATTCTGTTGTGCTGATTGTAATTCTGAATAGCTACGGATTTTTATGAAGATATAGTTTTTTAATATTGGTATTTCATTCCAGACATACTTCTGTATAAAGGATTTATTAAACGGTGTTGTTTTGATTGCTCTATAATACTTATCTTGTTGTCCTCTTAATTTTACCCAAGGTCTTTCAAACTCTTGGGAGTTAATGATTATAAGCATATTGTAAAGCTGTCCAGCTAATCCGAAGAATACTGGAAGCCAGTGGGTAAAGCTTGTCTGTTTTGGTAAAGCTGTTTGAACGTCTGACAAGAACAAGTCCAGACCTTCATATTTGTGGATTTTTTGAAACTTCATATTTTGATATGAACCGTCTACAATATCACTATATTTTACTGGTTGCCCAGTTTTTTGATTAATGTATCCAGGTCTTTAATATCTACTACTAAAACCACCGTAACCATAGTCCACGTTAGAGATATAGAGAGGTTTCGCATAAATGTGAACCCAGATTGCTTGTTGTTGTCTTTCATAACTCATTTGAAGACCAGTTTTAATGCGTTCTTTAATTGCTTGATACGTT}
\end{document}
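For comparison, the seqsplit package mentioned in the first answer needs no counting at all. A minimal sketch, assuming it is enough to break wherever the line is full rather than after a fixed number of characters (the sequence is a shortened piece of the one above):

```latex
\documentclass{article}
\usepackage{seqsplit}
\begin{document}
\noindent\ttfamily
% \seqsplit allows a line break after every character of its argument
\seqsplit{CTCCTTGGGCTGTTATTCCGTAAAAGTATTTGTGGAAGATACGGCTGTCATACATGATATGTTTTTTGTTTATAACAATAGTTCTTTCTTTGAT}
\end{document}
```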