[Tex/LaTex] Font substitution with XeLaTeX

I have an automatically generated document (using Doxygen) that has
many different Unicode characters for many languages: Hebrew, Japanese, Greek, Arabic and more.

Some parts of the text are displayed in monospace fonts.

I've tried the DejaVu Sans font, which includes most of the glyphs, but I am still
missing some Japanese characters and some characters in the monospace fonts.

Is there any way to tell XeLaTeX to make automatic substitutions if the
font is missing glyphs?

For example, could it use one font as the main font and, when a glyph is missing,
fall back to another font, or at least to a non-monospace font
that has the correct glyphs?

Best Answer

As Andrey points out, the key is the concept of "interchar tokens". Suppose you have some CJK characters that don't appear in the normal font and you want to use another font for them; say the characters are U+4E01 and U+4E02.

Then the following will do:

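\usepackage{fontspec}
\setmainfont[Ligatures=TeX]{<a font>}
\newfontfamily{\JapSubstFont}{<another font with the missing chars>}

% Enable interchar tokens and allocate a new character class
\XeTeXinterchartokenstate=1
\newXeTeXintercharclass\JapSubst

% Put the characters that need the substitute font into that class
\XeTeXcharclass"4E01=\JapSubst
\XeTeXcharclass"4E02=\JapSubst

% Class 0 is "normal" characters; 4095 is the word-boundary class
% (use 255 instead of 4095 with XeTeX versions before 0.99994)
\XeTeXinterchartoks 0 \JapSubst = {\begingroup\JapSubstFont}
\XeTeXinterchartoks 4095 \JapSubst = {\begingroup\JapSubstFont}
\XeTeXinterchartoks \JapSubst 0 = {\endgroup}
\XeTeXinterchartoks \JapSubst 4095 = {\endgroup}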

Whenever a character in the \JapSubst class starts or ends a "word", or is preceded or followed by a "normal" character (class 0), it is typeset in \JapSubstFont. The switch happens inside a group, so the main font is restored as soon as the run of \JapSubst characters ends.

One can also specify a range:

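% Assign every code point from U+4E01 up to U+4FFF to \JapSubst,
% using \count255 as a scratch counter
\count255="4E01
\loop\ifnum\count255<"5000
  \XeTeXcharclass\count255=\JapSubst
  \advance\count255 by 1
\repeat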
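If several ranges are needed, the loop can be wrapped in a small helper. The sketch below assumes the \JapSubst class has already been allocated as shown earlier; \AssignCharRange and \SubstCharCount are hypothetical names introduced here, not part of XeTeX or fontspec.

```latex
% Hypothetical helper: put every code point from #1 to #2 (inclusive) into class #3
\newcount\SubstCharCount   % dedicated scratch counter (name is an assumption)
\newcommand*\AssignCharRange[3]{%
  \SubstCharCount=#1\relax
  \loop\ifnum\SubstCharCount<\numexpr#2+1\relax
    \XeTeXcharclass\SubstCharCount=#3\relax
    \advance\SubstCharCount by 1
  \repeat
}

% Unicode blocks commonly needed for Japanese text
\AssignCharRange{"3040}{"309F}{\JapSubst} % Hiragana
\AssignCharRange{"30A0}{"30FF}{\JapSubst} % Katakana
\AssignCharRange{"4E00}{"9FFF}{\JapSubst} % CJK Unified Ideographs
```

The same helper could be reused with a second class and font family for another script, such as Hebrew or Arabic, as long as matching \XeTeXinterchartoks rules are defined for that class as well.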