Fill a Lua table with lowercase/uppercase pairs.
Both MiKTeX and TeX Live include the file UnicodeData.txt, which contains all the necessary information. It contains lines of the following form:
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
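There are several fields delimited by semicolons. The important ones are the first, which is the code point of the current character; the third, which is the general category of the character; and the thirteenth, which contains the code point of the corresponding uppercase character (the fifteenth, the titlecase mapping, is identical for most letters).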
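The field positions can be checked on the sample line with a few lines of plain Lua. This is only a sketch: `split_fields` is a hypothetical helper, written by hand so that empty fields are kept and the snippet runs without LuaTeX's string extensions.

```lua
-- split a record on ";" and keep empty fields
-- (hypothetical helper, not part of LuaTeX)
local function split_fields(line)
  local fields = {}
  -- append a separator so that the last field is matched as well
  for field in (line .. ";"):gmatch("([^;]*);") do
    fields[#fields + 1] = field
  end
  return fields
end

local sample = "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041"
local fields = split_fields(sample)

print(#fields)                    --> 15
print(fields[1])                  --> 0061  (code point)
print(fields[3])                  --> Ll    (general category)
print(fields[13], fields[15])     --> 0041  0041  (uppercase, titlecase)
print(tonumber(fields[13], 16))   --> 65
```

For this letter the uppercase and titlecase fields agree, which is the common case.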
We can write a simple Lua library that parses the file and returns a table with the necessary information:
local unicode_data = kpse.find_file("UnicodeData.txt")
local characters = {}
for line in io.lines(unicode_data) do
  local fields = line:explode ";"
  -- we want to process only lowercase letters
  if fields[3] == "Ll" then
    local lowercase = tonumber(fields[1],16)
    -- the simple uppercase mapping is in field 13 (field 15 is the
    -- titlecase mapping, which differs for a few digraph letters)
    -- some lowercase letters have no uppercase version; tonumber then
    -- returns nil and no entry is stored
    local uppercase = tonumber(fields[13],16)
    characters[lowercase] = uppercase
  end
end
return characters
We test for the Ll class, which denotes lowercase letters, and construct a table that maps each one to its uppercase code point. Note that some lowercase characters have no corresponding uppercase, but that's OK: they will simply not be included in the table.
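To see why lowercase letters without an uppercase counterpart simply drop out, here is a minimal plain-Lua sketch of the same loop run over two sample records instead of the whole file. The names `split_fields` and `build_table` are hypothetical, the hand-written split stands in for LuaTeX's `explode` so the snippet also runs outside LuaTeX, and the two records are assumed to match the corresponding lines of UnicodeData.txt.

```lua
-- split a record on ";" and keep empty fields (stand-in for line:explode ";")
local function split_fields(line)
  local fields = {}
  for field in (line .. ";"):gmatch("([^;]*);") do
    fields[#fields + 1] = field
  end
  return fields
end

-- build the lowercase -> uppercase table from a list of records
local function build_table(records)
  local characters = {}
  for _, line in ipairs(records) do
    local fields = split_fields(line)
    if fields[3] == "Ll" then
      local lowercase = tonumber(fields[1], 16)
      -- field 13 holds the simple uppercase mapping;
      -- an empty field converts to nil, so nothing is stored
      local uppercase = tonumber(fields[13], 16)
      if uppercase then
        characters[lowercase] = uppercase
      end
    end
  end
  return characters
end

local characters = build_table {
  "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041",
  "00DF;LATIN SMALL LETTER SHARP S;Ll;0;L;;;;;N;;;;;",
}

print(characters[0x61])  --> 65
print(characters[0xDF])  --> nil
```

The ß record has an empty uppercase field, which is why a special case for it has to be added by hand afterwards.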
Saved as makelowercases.lua, the library can be used in the following way:
\documentclass{article}
\directlua
{
  local lowercases = require "makelowercases"
  lowercases["ß"] = {"S","S"}
  fonts.handlers.otf.addfeature
  {
    name = "vircase",
    type = "multiple",
    data = lowercases
  }
}
\usepackage{fontspec}
\setmainfont{OpenSans-Regular.ttf}%
[
RawFeature=+vircase,
]
\begin{document}
AAAA aaaa ü ß ɒ e o
Hallo Welt!
\end{document}
It will produce the following result:
I would use the included unicode Lua module and fill the uppercase table in a loop, like this:
\documentclass{article}
\directlua
{
  local upper = unicode.utf8.upper
  local char = unicode.utf8.char
  local data = {}
  for c = 0x20, 0x0500 do
    data[char(c)] = {upper(char(c))}
  end
  data["ß"] = {"S","S"}
  fonts.handlers.otf.addfeature {
    name = "vircase",
    {
      type = "multiple",
      data = data,
    }
  }
}
\usepackage{fontspec}
\setmainfont{CMU Serif}%
[
RawFeature=+vircase,
]
\begin{document}
AAAA aaaa ü ß ɒ e o
Hallo Welt!
Привет, Мир!
\end{document}
Since I use TeX Live 2016, the syntax of the argument to fonts.handlers.otf.addfeature is a bit different; you can adjust it to your version. I've limited the loop to code points up to 0x0500, which covers the Latin, Greek and Cyrillic scripts. A Cyrillic example is also added (and works!).