[Tex/LaTex] Detect if all required glyphs are available

Tags: fonts, logging, symbols, xetex

In XeTeX, a message is printed to the log file if a glyph is not available with the current font. The input

\documentclass{scrartcl}

\begin{document}
  α
\end{document}

leads to the following entry in the log:

Missing character: There is no α in font cmr10!

Is there a way to check, for a given input, whether all glyphs required to render it are available, without relying on the log file? Apparently there is no command to flush the log file, and (in a scenario where I pipe input into a running XeLaTeX process) I'd like to know before XeTeX terminates.

Best Answer

Longer post

This task was a big issue for me many years ago when I was typesetting and generating preview books of fonts and needed to know all kinds of information about the glyphs before the actual typesetting. I have divided the answer into three small parts, each with a compilable TeX file.


Part 1: xelatex and its tool

There is a way in xelatex; it is mentioned indirectly on page 8 of the reference manual. We can use the \XeTeXcharglyph command, which returns the glyph slot of a character in the current font, and check whether the result is 0 (or .notdef if we check the glyph name with \XeTeXglyphname). Whenever a character has no glyph in the font, it is mapped to slot 0.

After that step we can use an \ifnum...\else...\fi statement to decide what to typeset, if anything at all. In the example I test four characters, ? A \ a; two of them are defined in the font and the other two are not. I downloaded and installed RodgauApesInitials from Manfred Klein's collection, http://moorstation.org/typoasis/designers/klein04/deco/rodgau_apes.htm

We run xelatex mal-kanji1.tex and the terminal says:

The glyph slot of 003F(hex) is: 0 (.notdef)
The glyph slot of 0041(hex) is: 3 (A)
The glyph slot of 005C(hex) is: 0 (.notdef)
The glyph slot of 0061(hex) is: 29 (a)

This is the code:

% run: xelatex mal-kanji1.tex
\documentclass[a4paper]{article}
\pagestyle{empty}
\usepackage{xltxtra}
\usepackage{fontspec}
\usepackage{pgffor}
\begin{document}
%\font\malfont=RodGauApes.ttf
\font\klein="RodGauApes Initials"
\font\cmr=cmr10%
\message{^^J}% One \n to the terminal...
% U+0041, A, dec 65
\def\checkme#1{%
\klein\message{The glyph slot of #1(hex) is: \the\XeTeXcharglyph"#1 \space 
  (\XeTeXglyphname\klein\XeTeXcharglyph"#1)^^J}%
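% Note: \XeTeXcharglyph needs a native (TTF/OTF) font. With cmr10 XeTeX reports:
%! Cannot use \XeTeXcharglyph with cmr10; not a native platform font.
\ifnum\XeTeXcharglyph"#1=0 % Slot 0 is .notdef: the character is missing...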
\cmr Undefined glyph!%
  \else 
\klein\char"#1%
\fi% End of \ifnum...
}% End of \checkme...
\foreach \malglyph in {003F,0041,005C,0061}% ? A \ a
  {\checkme\malglyph\cmr\par%
  }% End of \foreach...
\end{document}

mwe, part 1
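The same test can be applied to a whole string instead of a list of hex codes. The following is only a minimal sketch under some assumptions: the file name check-string.tex, the macro \CheckGlyphs and the counter \missingglyphs are hypothetical names of mine, the text must consist of plain characters only (no control sequences, spaces are skipped by \@tfor), and the current font must be a native one, which is why fontspec is loaded. The result goes to the terminal through \typeout, so it can be read while XeTeX is still running.

```latex
% run: xelatex check-string.tex   (hypothetical file name)
\documentclass{article}
\usepackage{fontspec}% The default font is then a native (OTF) font...
\makeatletter
\newcount\missingglyphs% Hypothetical counter: characters without a glyph...
% \CheckGlyphs{<text>} tests every character of <text> against the
% current font. Assumption: <text> contains plain characters only.
\newcommand*\CheckGlyphs[1]{%
  \missingglyphs=0
  \@tfor\@tempa:=#1\do{% Loop over the characters one by one...
    \ifnum\XeTeXcharglyph\expandafter`\@tempa=0 % Slot 0 means .notdef...
      \advance\missingglyphs by 1
      \typeout{Missing glyph for \@tempa}%
    \fi
  }%
  \typeout{Missing glyphs in total: \the\missingglyphs}%
}
\makeatother
\begin{document}
\CheckGlyphs{Aα}% Test the current font before typesetting anything...
\end{document}
```

After the call, \missingglyphs can be tested with \ifnum to decide whether to typeset the text, switch to a fallback font, or stop.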

I needed and used this strategy when I typeset a preview book of kanji. I was using many different font faces and I needed to know exactly how many glyphs were going to be typeset. If one or more glyphs were missing, it was up to me to decide whether to typeset that rectangle or not. I should say that it is normal for a font to miss some CJKV characters, since there are so many of them. I enclose a snapshot from the book; it is a preview of a single Japanese character (kanji).

book of kanji: a snapshot

The big disadvantage is that we can use this strategy only with the xelatex format and that we can check only newer font formats (TTF, OTF); it does not work with PFB. That's a serious limitation in practice. Let me show you another approach.


Part 2: testfont.tex

One of the tools in the TeX distribution is the testfont.tex file. If we run, e.g.:

pdftex testfont.tex

We are asked for the name of the font; let's type dmjhira and hit Enter. Typing \help followed by Enter shows a list of options. The most common choice is to type \table\bye followed by Enter. We get the testfont.pdf file, and this is a preview of it.

mwe, part 2

The problem is that we cannot customize the output, and we cannot test TTF and OTF fonts without installing and setting them up for TeX first. It is worth checking another approach.

Part 3: Measuring a TeX box (Hiragana and Katakana as test cases)

The core of TeX is measuring boxes. That is the approach we are going to use and test. We will test a single glyph, but we could easily measure almost anything. After defining a new box (\newbox), we fill the box without typesetting it (\setbox) and then we can measure the width, height and depth of the box (\wd, \ht and \dp).
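To make the three steps concrete, here is a minimal sketch. The file name measure-box.tex and the box name \probe are hypothetical; the dimensions are written to the terminal with \typeout, and the box is typeset only afterwards.

```latex
% run: *latex measure-box.tex   (hypothetical file name)
\documentclass{article}
\newbox\probe% Hypothetical box name...
\begin{document}
\setbox\probe=\hbox{g}% Fill the box; nothing is typeset yet...
\typeout{width=\the\wd\probe, height=\the\ht\probe, depth=\the\dp\probe}
Measured box: \box\probe% Now typeset (and empty) the box...
\end{document}
```

The letter g is chosen because it has a descender, so all three dimensions should be nonzero in an ordinary text font.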

In the example below we typeset Hiragana, a Japanese syllabary, but we may spot several defects. There is a space at the very beginning, the lines don't break ideally, the spaces shrink and stretch, and we have no idea how many glyphs were typeset. But it is a starting point for further exploration.

It is worth mentioning that we can run any major LaTeX engine, and if we use e.g. the fontspec package, we can test glyphs in TTF, OTF and some dfont files when running lualatex or xelatex. I enclose the TeX code and a preview of the result.

% run: *latex mal-kanji2.tex
\documentclass[a4paper]{article}
\pagestyle{empty}% No page number please...
\parindent=0pt% No indentation please...
\rightskip=6cm% And TeXie, :-), please narrow the text width somehow...
\begin{document}
\font\hira=dmjhira at 2ex% Setting up a new font face (Hiragana)...
\newcount\malcounter% Setting up a new counter...
\malcounter=-1% The initialization of the counter
\loop% The core of this example...
  \advance\malcounter by 1% Move on to the next glyph...
  {\hira\char\malcounter} % Show me the glyph, add space...
  %\discretionary{}{}{}% Allow a line break after the glyph... (the first alternative)
   \allowbreak% (the second alternative)
\ifnum\malcounter<255\repeat% Run all 256 characters in the font...
\end{document}

mwe, part 3a


We typeset a glyph only when its width is nonzero. Some ornament fonts contain zero-width glyphs, but that is considered bad font design. We add a counter (\newcount) to keep track of the glyphs, we fix the interword space by setting its stretch, shrink and extra space to zero (\fontdimen 3, 4 and 7), and we allow line breaks; simply said, we now have full control over the output.

I enclose an example and a preview of the PDF file with Katakana, a Japanese syllabary.

% run: *latex mal-kanji3.tex
\documentclass[a4paper]{article}
\pagestyle{empty}
\parindent=0pt
\begin{document}
\newbox\emptybox
\setbox\emptybox=\hbox{}
\newbox\malbox
\font\kata=dmjkata% A new font face to be used (Katakana)...
\newcount\counter 
\counter=-1% The counter of glyphs...
\fontdimen3\font=0pt \fontdimen4\font=0pt \fontdimen7\font=0pt 
% Eliminate the stretch in spaces, or, we could use \makebox...
\loop% Process all the glyphs...
  \advance\counter by 1% Go to the next glyph...
  \setbox\malbox=\hbox{\kata\char\counter}% Measure the glyph...
  \ifdim\wd\malbox=0pt \else% Is width 0pt? Height is not tested (\ht, \dp)...
    \texttt{\ifnum\counter<10 0\fi% Add zero in front of number <10...
    \the\counter.\copy\malbox\ }% Show me the glyph...
    \discretionary{}{}{}%Allow the line break, or, \allowbreak...
  \fi% End of \ifnum condition...
\ifnum\counter<255\repeat% Show me all the glyphs in range 0-255...
\end{document}

mwe, part 3b
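Measuring the width is only a proxy for "the glyph exists". A different technique, not used in the examples above, is the e-TeX primitive \iffontchar, which asks the font directly whether a slot is occupied; it is available in pdftex, xetex and luatex. The following is a minimal sketch under the assumption that the dmjkata font from the previous example is installed; the file name mal-kanji4.tex and the counters \slot and \present are hypothetical names. The second counter counts the glyphs that were actually typeset and reports the total on the terminal.

```latex
% run: *latex mal-kanji4.tex   (hypothetical file name)
\documentclass[a4paper]{article}
\pagestyle{empty}
\parindent=0pt
\begin{document}
\font\kata=dmjkata % Assumption: the same Katakana font as above...
\newcount\slot% Hypothetical counter: the slot being tested...
\newcount\present% Hypothetical counter: slots that contain a glyph...
\slot=-1
\loop% Process all the slots...
  \advance\slot by 1
  \iffontchar\kata\slot% Does the font define this slot?
    \advance\present by 1
    \texttt{\the\slot.}{\kata\char\slot}\ \allowbreak
  \fi
\ifnum\slot<255 \repeat% Test the range 0-255...
\typeout{Slots with a glyph: \the\present}
\end{document}
```

Unlike the width test, this does not skip a glyph that exists but happens to have zero width.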


Conclusions

These methods have some limitations. We cannot detect whether two glyphs are merely rotated, flipped or scaled copies of each other, and the box-measuring examples do not list glyphs outside the 0-255 range. In case you need something like that, use FontForge; it can be done there, as it has its own scripting language and support for Python scripts.

In case you are wondering what the splash under the last letter in the very first example was: oh yes, that's an ape from the legendary font creator Manfred Klein himself! I enclose a screenshot from FontForge and a preview of his font with its 26+26 letters.

Manfred Klein, part 1

Manfred Klein, part 2