How to find out which unicode codepoints are defined in a TTF file?
I found a Python library, fonttools (on PyPI), that can do this with a bit of scripting.
Here is a simple script that lists which of the given font files contain a specified glyph:
#!/usr/bin/env python3
from fontTools.ttLib import TTFont
import sys

# Codepoint to look for: accepts decimal or prefixed (0x/0o/0b) input.
char = int(sys.argv[1], base=0)
print("Looking for U+%X (%c)" % (char, chr(char)))

for arg in sys.argv[2:]:
    try:
        font = TTFont(arg)
        # Check every Unicode cmap subtable; cmap.cmap maps codepoint -> glyph name.
        for cmap in font['cmap'].tables:
            if cmap.isUnicode() and char in cmap.cmap:
                print("Found in", arg)
                break
    except Exception as e:
        print("Failed to read", arg)
        print(e)
The first argument is the codepoint (decimal, or hex with a 0x prefix); the remaining arguments are the font files to search.
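The flexible codepoint parsing comes from the base=0 argument to int(): Python then infers the radix from the prefix, exactly as it does for integer literals. A quick self-contained check:

```python
# int(s, base=0) infers the radix from the prefix, like a Python literal:
# no prefix -> decimal, 0x -> hex, 0o -> octal, 0b -> binary.
samples = ["64258", "0xFB02", "0o175402", "0b1111101100000010"]
values = [int(s, base=0) for s in samples]
assert values == [0xFB02] * 4  # all four spell U+FB02, the fl ligature
print("U+%04X is %s" % (values[0], chr(values[0])))
```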
I didn't bother trying to make it work for .ttc collections (TTFont needs an extra fontNumber argument to pick one face out of the collection).
Note: I first tried the otfinfo tool, but it only reported Basic Multilingual Plane characters (<= U+FFFF). The Python script finds supplementary-plane characters fine.
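A plausible explanation (my assumption, not verified against otfinfo's source) is that a tool reading only the legacy format 4 cmap subtable is stuck with 16-bit codepoints, while format 12 subtables use 32-bit values. The size limit is easy to illustrate:

```python
import struct

BMP_MAX = 0xFFFF
cp = 0x1F602  # a supplementary-plane codepoint (an emoji)

# A format 4 cmap subtable stores codepoints as unsigned 16-bit integers,
# so anything above U+FFFF simply does not fit:
try:
    struct.pack(">H", cp)
    fits_in_16_bits = True
except struct.error:
    fits_in_16_bits = False

assert cp > BMP_MAX and not fits_in_16_bits
# A format 12 subtable uses 32-bit values and covers the whole range:
assert len(struct.pack(">I", cp)) == 4
```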
otfinfo looks promising:
-u, --unicode
Print each Unicode code point supported by the font, followed by
the glyph number representing that code point (and, if present,
the name of the corresponding glyph).
For example, DejaVuSans-Bold knows about the fl ligature (U+FB02, ﬂ):
$ otfinfo -u /usr/share/fonts/TTF/DejaVuSans-Bold.ttf | grep ^uniFB02
uniFB02 4899 fl
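Since the question asks which codepoints are defined at all, the same cmap API the script uses can also dump the full set. A sketch, assuming only the fontTools objects already shown above (font['cmap'].tables, .isUnicode(), and the .cmap dict):

```python
def unicode_codepoints(font):
    """Union of codepoints across all Unicode cmap subtables.

    `font` is a fontTools TTFont (or anything exposing the same shape:
    font["cmap"].tables is a list of subtables, each with .isUnicode()
    and a .cmap dict mapping codepoint -> glyph name).
    """
    cps = set()
    for table in font["cmap"].tables:
        if table.isUnicode():
            cps |= set(table.cmap)
    return cps

# With the real library this would be used roughly as:
#   from fontTools.ttLib import TTFont
#   print(sorted(hex(c) for c in unicode_codepoints(TTFont("DejaVuSans.ttf"))))
```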