How to typeset ancient Persian scripts in LaTeX

Use XeLaTeX. The font maps the glyphs in the Latin alphabet slots, so you have to figure out yourself the correspondence.

\documentclass{article}
\usepackage{fontspec}

\newfontfamily{\oldpersian}{Parsoomash}

\begin{document}

This is ancient Persian script {\oldpersian ABCDFG}

\end{document}

enter image description here

Here's a table:

\documentclass{article}
\usepackage{fontspec}
\usepackage{multicol}

\newfontfamily{\oldpersian}{Parsoomash}

\begin{document}

This is ancient Persian script {\oldpersian ABCDFG}

\begin{multicols}{4}
\count255=32
\loop\ifnum\count255<125
\advance\count255 1
\hbox{\hbox to 1em{\symbol{\count255}\hss}\hbox{\oldpersian\symbol{\count255}}}
\repeat
\end{multicols}

\end{document}

enter image description here


mitayana

(1) The oldprsn font package from 2005, bundled with other archaic fonts into the archaic package, might help. It defines a special font family for Tex.

(2) Alternatively, in the unicode space:

Using Xelatex, fontspec, a unicode font containing glyphs belonging to the Old Persian code block, a font mapping file to enable easy input via combinations of "plain" characters, and adapting egreg's transliteration solution for Hittite (How to modify single (ranges of) characters within a custom environment?), we get:

sample syllables

MWE:

\documentclass[12pt]{article}
\usepackage{fontspec}
\usepackage{graphicx}
\usepackage{xcolor}
\usepackage{stackengine}


\setmainfont{Linux Libertine} 



\newcommand\operfont{Noto Sans Old Persian}
\newcommand\opercolour{blue}
\newcommand\opermapping{latin-to-oldpersian}
\newfontface\fmoper[Mapping=\opermapping ,Colour=\opercolour]{\operfont}



\DeclareTextFontCommand{\textoper}{\Large\fmoper}



\newenvironment{toper}
  {\fmoper\Large\ignorespaces}
  {\ignorespacesafterend}


\ExplSyntaxOn
\NewDocumentCommand{\eoper}{m}
 {
  \tl_set:Nn \l_xander_oper_tl { #1 }
  % change every run of lowercase letters into italic
  \regex_replace_all:nnN
   { [a-z]+ }
   { \c{textit}\cB\{\0\cE\} }
   \l_xander_oper_tl
% change every xx to name
  \regex_replace_all:nnN
   { ([xx]){2,2} }
   { |Xshaayathiya| }
   \l_xander_oper_tl
  % change every ch into c U+030c
  \regex_replace_all:nnN
   { ([ch]){2,2} }
   { c \x{030c} }
   \l_xander_oper_tl
  % change every th into θ = U+03b8
  \regex_replace_all:nnN
   { ([th]){2,2} }
   { \x{03b8} }
   \l_xander_oper_tl
  % change every ss into c U+0327
  \regex_replace_all:nnN
   { ([ss]){2,2} }
   { c \x{0327} }
   \l_xander_oper_tl  
   % change every s into s U+030C
  \regex_replace_all:nnN
   { ([sh]){2,2} }
   { s \x{030c} }
   \l_xander_oper_tl

   % change every am into AM
  \regex_replace_all:nnN
   { ([aur]){3,3} }
   { |AuraMazda| }
   \l_xander_oper_tl
   % change every dah into Dah
  \regex_replace_all:nnN
   { ([dah]){3,3} }
   { |Dahya \x{0304}ush| }
   \l_xander_oper_tl
   % change every baga into Baga
  \regex_replace_all:nnN
   { ([baga]){4,4} }
   { |BAGA| }
   \l_xander_oper_tl
   % change every buu into Buumish
  \regex_replace_all:nnN
   { ([buu]){3,3} }
   { |Buumish| }
   \l_xander_oper_tl

%  % change every double vowel aa/ into a/ U+0301
%  \regex_replace_all:nnN
%   { ([a,e,i,u]){2,2} }
%   { \1 \x{0301} }
%   \l_xander_oper_tl
%  % change |...| into \textsuperscript{...}
% change |...| into raised small-caps
  \regex_replace_all:nnN
   { \|([^|]+)\| }
   { \c{raisebox}\cB\{0.5ex\cE\}\cB\{\c{textsc}\cB\{\1\cE\}\cE\} 
    }
%    \raisebox{0.5ex}{yyy}
%   { \c{textsuperscript}\cB\{\c{textsc}\cB\{\1\cE\}\cE\} 
%    }
   \l_xander_oper_tl


   % change every buu into Buumish
  \regex_replace_all:nnN
   { ([Buumish]){7,7} }
   { Bu\x{0304}mis\x{030c} }
   \l_xander_oper_tl
% \x{0304}

  % print the result
  \tl_use:N \l_xander_oper_tl
 }
\ExplSyntaxOff



\newcommand\hstackonh[1]{\stackon{\begin{toper}#1\end{toper}}{\eoper{#1}}}  

\setstackgap{L}{1.24\normalbaselineskip}%4.25ex}
\newcommand\hstackon[1]{\Longstack{ \raisebox{1.12ex}{\eoper{#1}}  \begin{toper}#1\end{toper}}}  




\begin{document}
\section{Old Persian \textoper{la-da div pa-aur-ra-sa-i-a-na}}
For heading, see ref\footnote{\texttt{oldprsn} package (2005) documentation: "Old Persian in the Old Persian script" "(as near as possible)"}. \textoper{mi-ta-ya-na}.



\begin{center}
\hstackon{a}
\hstackon{i}
\hstackon{u}
\hstackon{ka}
\hstackon{ku}
\hstackon{ga}
\hstackon{gu}
\hstackon{xa}
\par
\hstackon{a-i-u-ka-ku-ga-gu-xa}
\end{center}


\begin{center}
\hstackon{cha}
\hstackon{tha}
\hstackon{sha}
\hstackon{ssa}
\par
\hstackon{cha-tha-sha-ssa}
\end{center}


\begin{center}
\hstackon{aur}
\hstackon{aur2}
\hstackon{aur3}
\hstackon{xx}
\par
\hstackon{aur-aur2-aur3-xx}
\end{center}


\begin{center}
\hstackon{dah}
\hstackon{dah2}
\hstackon{baga}
\hstackon{buu}
\par
\hstackon{dah-dah2-baga-buu}
\end{center}

\section{Usage}
(0) Compile the map text file with teckit\_compile to produce a binary tec file.

\verb|teckit_compile foo.map foo.tec|

Call the mapping file in the font command:

\verb|\newfontface\fontfoo[Mapping=foo]{SomeFontName}|

(1) To typeset in cuneiform, use \textbackslash\texttt{textoper} and type the mapping syllables: 

\verb|\textoper{a}| $\to$ \textoper{a}.

\textoper{da-a-ra-ya-va-ha-u-sha-} (son of Darius).

(2) To typeset transliteration with diacritics, type the mapping syllables and use \textbackslash\texttt{eoper}: 

\verb|\eoper{a}| $\to$ \eoper{a}.

\eoper{da-a-ra-ya-va-ha-u-sha-}.

(3) To typeset ruby text, use \textbackslash\texttt{hstackon}:

\verb|\hstackon{a}| $\to$ \hstackon{a} (per syllable).

\hstackon{da-a-ra-ya-va-ha-u-sha-} (per word).
\end{document}

Mapping file (to be compiled with teckit_compile):

; TECkit mapping for TeX input conventions <-> Unicode characters

LHSName "latin-to-oldpersian"
RHSName "UNICODE"

pass(Unicode)

; ligatures from Knuth's original CMR fonts
U+002D U+002D           <>  U+2013  ; -- -> en dash
U+002D U+002D U+002D    <>  U+2014  ; --- -> em dash

U+0027          <>  U+2019  ; ' -> right single quote
U+0027 U+0027   <>  U+201D  ; '' -> right double quote
U+0022           >  U+201D  ; " -> right double quote

U+0060          <>  U+2018  ; ` -> left single quote
U+0060 U+0060   <>  U+201C  ; `` -> left double quote

U+0021 U+0060   <>  U+00A1  ; !` -> inverted exclam
U+003F U+0060   <>  U+00BF  ; ?` -> inverted question

; additions supported in T1 encoding
U+002C U+002C   <>  U+201E  ; ,, -> DOUBLE LOW-9 QUOTATION MARK
U+003C U+003C   <>  U+00AB  ; << -> LEFT POINTING GUILLEMET
U+003E U+003E   <>  U+00BB  ; >> -> RIGHT POINTING GUILLEMET


;U+0020    >  U+0020 ;  space maps to space
U+002D    >  U+200D ;  hyphen as Zero Width Joiner
U+002E    >  U+200D ;  dot as Zero Width Joiner
U+007C    >  U+200C ;  pipe as Zero Width Non-Joiner



U+0061         <>  U+103A0    ;  a  
U+0069         <>  U+103A1    ;  i 
U+0075         <>  U+103A2    ;  u 
U+006B U+0061        <>  U+103A3    ;  ka 
U+006B U+0075        <>  U+103A4    ;  ku 
U+0067 U+0061        <>  U+103A5    ;  ga 
U+0067 U+0075        <>  U+103A6    ;  gu 
U+0078 U+0061        <>  U+103A7    ;  xa 
U+0063 U+0068 U+0061       <>  U+103A8    ;  cha 
U+006A U+0061        <>  U+103A9    ;  ja 
U+006A U+0069        <>  U+103AA    ;  ji 
U+0074 U+0061        <>  U+103AB    ;  ta 
U+0074 U+0075        <>  U+103AC    ;  tu 
U+0064 U+0061        <>  U+103AD    ;  da 
U+0064 U+0069        <>  U+103AE    ;  di 
U+0064 U+0075        <>  U+103AF    ;  du 
U+0074 U+0068 U+0061       <>  U+103B0    ;  tha 
U+0070 U+0061        <>  U+103B1    ;  pa 
U+0062 U+0061        <>  U+103B2    ;  ba 
U+0066 U+0061        <>  U+103B3    ;  fa 
U+006E U+0061        <>  U+103B4    ;  na 
U+006E U+0075        <>  U+103B5    ;  nu 
U+006D U+0061        <>  U+103B6    ;  ma 
U+006D U+0069        <>  U+103B7    ;  mi 
U+006D U+0075        <>  U+103B8    ;  mu 
U+0079 U+0061        <>  U+103B9    ;  ya 
U+0076 U+0061        <>  U+103BA    ;  va 
U+0076 U+0069        <>  U+103BB    ;  vi 
U+0072 U+0061        <>  U+103BC    ;  ra 
U+0072 U+0075        <>  U+103BD    ;  ru 
U+006C U+0061        <>  U+103BE    ;  la 
U+0073 U+0061        <>  U+103BF    ;  sa 
U+007A U+0061        <>  U+103C0    ;  za 
U+0073 U+0068 U+0061       <>  U+103C1    ;  sha 
U+0073 U+0073 U+0061       <>  U+103C2    ;  ssa 
U+0068 U+0061        <>  U+103C3    ;  ha 


U+0061 U+0075 U+0072       <>  U+103C8    ;  aur 
U+0061 U+0075 U+0072 U+0032      <>  U+103C9    ;  aur2 
U+0061 U+0075 U+0072 U+0033      <>  U+103CA    ;  aur3 
U+0078 U+0078        <>  U+103CB    ;  xx 
U+0064 U+0061 U+0068       <>  U+103CC    ;  dah 
U+0064 U+0061 U+0068 U+0032      <>  U+103CD    ;  dah2 
U+0062 U+0061 U+0067 U+0061      <>  U+103CE    ;  baga 
U+0062 U+0075 U+0075       <>  U+103CF    ;  buu 


U+0064 U+0069 U+0076       <>  U+103D0    ;  div 
U+0031         <>  U+103D1    ;  1 
U+0032         <>  U+103D2    ;  2 
U+0031 U+0030        <>  U+103D3    ;  10 
U+0032 U+0030        <>  U+103D4    ;  20 
U+0031 U+0030 U+0030       <>  U+103D5    ;  100 

Adjust the mapping, and the matching diacritic regexes as required. It all depends on what input method you have or want (for example, there are various Old Persian virtual keyboards available for download from linguistics sites; or you can program your own).

Tags:

Persian