[TeX/LaTeX] I get tofu instead of Lepcha (Tibeto-Burman script)

expex fontspec linguistics polyglossia xetex

For a school assignment I need to display some Lepcha text. This is what I had initially:

\documentclass[12pt]{article}
\usepackage[a4paper]{geometry}
\usepackage{fontspec}
\setmainfont{Gentium Plus}

\usepackage{parskip}
\usepackage{multicol}
\usepackage{vowel}
\usepackage{pifont}
\newcommand{\BlankCell}{}

\usepackage{polyglossia}
\setdefaultlanguage{german} 
\setotherlanguages{bengali, hindi, malayalam, khmer, french, greek, russian, thai, armenian, tibetan, lepcha} 

\overfullrule=2cm % displays black bars where the line extends over the edge of the page. 
%THE OTHER SCRIPTS ARE DEFINED LIKE THIS, SO I FIGURED I CAN DO THE SAME WITH LEPCHA.
\newfontfamily\lepchafont[Script=Lepcha]{Noto Sans Lepcha}
\newfontfamily\bengalifont[Script=Bengali]{Shonar Bangla}
\newfontfamily\hindifont[Scr... and so on

The other scripts are defined like this, so I figured I could just add another one.
This is a header file that is loaded into another file, but the error points directly here. I suppose it has to do with Lepcha not being defined in the polyglossia package, but I don't know how to define it separately. This is the error I got:

LaTeX error: "kernel/key-choice-unknown" Key 'fontspec/Script' accepts
only a fixed set of choices. For immediate help type H <return>.
…epchafont[Script=Lepcha]{Noto Sans Lepcha}

I tried adding it in a similar fashion as Kana and Hangul, but that also didn't do the trick. This is what that miserable attempt looks like:

\usepackage{xeCJK}
\xeCJKDeclareSubCJKBlock{Kana}{"3040 -> "309F, "30A0 -> "30FF, "31F0 -> "31FF, "1B000 -> "1B0FF}
\xeCJKDeclareSubCJKBlock{Hangul}{"1100 -> "11FF, "3130 -> "318F, "A960 -> "A97F, "AC00 -> "D7AF, "D7B0 -> "D7FF}
\xeCJKDeclareSubCJKBlock{Lepcha}{"1C00 -> "1C4F}

The following code is supposed to produce the output shown in the picture below, but currently doesn't.

...
\definelingstyle{Lepcha}{}
...

\ex[lingstyle=Lepcha]<lepcha>
\begingl 
\glpreamble \rightcomment{[Lepcha]}  ᰃᰨ ᰎᰥᰨᰜᰤᰦᰵ ᰀᰩᰰ ᰍᰩᰵᰡᰨ //
\gla go proljaŋ-kɔn nɔŋ-ʃo//
\glb 1\textsc{sg} Bhutan-side go-\textsc{nprt}//
\glft `I am going in the direction of Bhutan.’ (Plaisier 2007:82)} //
\endgl
\xe

[Image: the intended Lepcha output]

What other information do you need?

Best Answer

There are multiple issues here (and the error message isn't for the reason you guessed):

  1. fontspec does not know about Lepcha,
  2. polyglossia does not know about Lepcha either, but that is ok,
  3. your expex syntax isn't correct.

fontspec

The package fontspec lets you use any font (installed on your computer) in the (now) standard OpenType format (such as Noto Sans Lepcha in this case), in XeTeX or LuaTeX.

Generally, LaTeX and its packages are not exactly known for helpful error messages, but fontspec is excellent in this respect. If your file starts with

\documentclass[12pt]{article}
\usepackage{fontspec}
\newfontfamily\lepchafont[Script=Lepcha]{Noto Sans Lepcha}

and you compile it with xelatex, you get the error message:

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!
! LaTeX error: "kernel/key-choice-unknown"
! 
! Key 'fontspec/Script' accepts only a fixed set of choices.
! 
! See the LaTeX3 documentation for further information.
! 
! For immediate help type H <return>.
!...............................................  

l.3 ...lepchafont[Script=Lepcha]{Noto Sans Lepcha}

? 

If you've used LaTeX much you've probably despaired of ever getting anything useful by typing that H, but this case is an exception: typing H at that prompt gives:

? H
|'''''''''''''''''''''''''''''''''''''''''''''''
| The key 'fontspec/Script' only accepts predefined values, and 'Lepcha' is
| not one of these.
|...............................................
? 

which is perfectly clear: it means that fontspec doesn't recognize the [Script=Lepcha] in your

\newfontfamily\lepchafont[Script=Lepcha]{Noto Sans Lepcha}

If you look up the fontspec documentation by invoking texdoc fontspec (or by finding it online), you'll see in section 10.18 "OpenType scripts and languages" that the list of supported scripts is quite small, and Lepcha isn't in the list:

[Image: Table 13 of the fontspec documentation, the list of supported scripts]

There are two directions we can go from here:

  1. somehow getting Script=Lepcha to work
  2. actually understanding what's going on

Path 1: Making "Script=Lepcha" work

You'll also see in section 10.18.2 "Defining new scripts and languages" the way to define new scripts not included in the table: for example,

\newfontscript{Arabic}{arab}

where the first argument is the fontspec name, and the second is the OpenType tag.

What is the OpenType tag to use? The OpenType specification says it's lepc, so let's try that. Here's a minimal working example:

\documentclass[12pt]{article}
\usepackage{fontspec}
\newfontscript{Lepcha}{lepc}
\newfontfamily\lepchafont[Script=Lepcha]{Noto Sans Lepcha}

\begin{document}

Lepcha ahead: {\lepchafont ᰃᰨ ᰎᰥᰨᰜᰤᰦᰵ ᰀᰩᰰ ᰍᰩᰵᰡᰨ} -- that was Lepcha.

\end{document}

produces:

[Image: output of the MWE with Lepcha defined]

The compilation also produces this warning:

*************************************************
* fontspec warning: "script-not-exist"
* 
* Font 'Noto Sans Lepcha' does not contain script 'Lepcha'.
*************************************************

but the output is fine so we can ignore it. But should we?

Path 2: Understanding what's going on

If you actually read the fontspec documentation, you'll understand what the command

\newfontfamily\lepchafont[Script=Lepcha]{Noto Sans Lepcha}

is supposed to mean: it asks to load, whenever you write \lepchafont, the OpenType font "Noto Sans Lepcha", with the further information that you wish to use the font's OpenType features for script "Lepcha". This further information is intended for good multilingual fonts that contain different font features for different scripts.

In this case, if you get Noto Sans Lepcha in its released version and examine the font with ttx, you'll find that nowhere does the font contain any features specific to script tag lepc. Even the early-access version of Noto Sans Lepcha mentions lepc only in the font's GPOS table and not in its GSUB table as is the case with other fonts like, say, Noto Sans Kannada. In other words, Noto Sans Lepcha implements its complex text layout not using the new Lepcha-aware way, but using the default way with ligatures, under script tag "DFLT". You can also see this in the .log file produced by xelatex, which has this information from fontspec:

.................................................
. fontspec info: "defining-font"
. 
. Font family 'NotoSansLepcha(0)' created for font 'Noto Sans Lepcha' with
. options [Script=Lepcha].
. 
. This font family consists of the following shapes:
. 
. * 'normal' with NFSS spec.:
. <->"Noto Sans Lepcha/OT:"
. 
. * 'small caps' with NFSS spec.:
. 
. and font adjustment code:
. 
.................................................
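A document can also ask the loaded font directly whether it declares a given script tag. The sketch below assumes the \fontspec_if_script:nTF conditional described in the programming-interface part of the fontspec manual. \CheckScriptTag is a hypothetical helper name, not something fontspec provides. The result is written to the terminal and the .log file rather than typeset, since Noto Sans Lepcha may not contain Latin glyphs:

```latex
\documentclass[12pt]{article}
\usepackage{fontspec}
\newfontfamily\lepchafont{Noto Sans Lepcha}

% \CheckScriptTag is a hypothetical helper, not part of fontspec.
% It tests the *current* font for a raw OpenType script tag and
% reports the result in the terminal output and the .log file.
\ExplSyntaxOn
\NewDocumentCommand{\CheckScriptTag}{m}
  {
    \fontspec_if_script:nTF {#1}
      { \typeout{Script~tag~#1:~present} }
      { \typeout{Script~tag~#1:~absent} }
  }
\ExplSyntaxOff

\begin{document}

% Switch to the Lepcha font inside a group, then query it.
{\lepchafont \CheckScriptTag{lepc}\CheckScriptTag{DFLT}}

\end{document}
```

What gets reported depends on the installed version of the font, so treat this as a diagnostic aid rather than a fixed result.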

Why does the font do things this way? You can see illuminating discussions here:

  • Bugzilla (comment by Jonathan Kew, associated with SIL and incidentally the developer of XeTeX): currently only harfbuzz and Windows 10 recognize the lepc tag; older shaping engines don't.
  • noto-fonts issue 451 and issue 395: the Unicode standard and Universal Shaping Engine specifications for Lepcha are still under debate.

Anyway, the point is that as Noto Sans Lepcha does not contain the script tag lepc, you should not use what was written in Path 1: you can use

\newfontscript{Lepcha}{DFLT}

(since "DFLT", short for "default", is the script tag the font actually uses), but the more sensible thing to do is simply to load the font without the extra option. The correct minimal example is therefore the following, one line shorter than before:

\documentclass[12pt]{article}
\usepackage{fontspec}
\newfontfamily\lepchafont{Noto Sans Lepcha}

\begin{document}

Lepcha ahead: {\lepchafont ᰃᰨ ᰎᰥᰨᰜᰤᰦᰵ ᰀᰩᰰ ᰍᰩᰵᰡᰨ} -- that was Lepcha.

\end{document}
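For comparison, the DFLT variant mentioned above would look roughly like this. It is a sketch that assumes, as described, a fontspec version whose script list lacks Lepcha and a Noto Sans Lepcha that registers its features under DFLT:

```latex
\documentclass[12pt]{article}
\usepackage{fontspec}
% Map the fontspec script name "Lepcha" to the tag the font actually uses.
\newfontscript{Lepcha}{DFLT}
\newfontfamily\lepchafont[Script=Lepcha]{Noto Sans Lepcha}

\begin{document}

Lepcha ahead: {\lepchafont ᰃᰨ ᰎᰥᰨᰜᰤᰦᰵ ᰀᰩᰰ ᰍᰩᰵᰡᰨ} -- that was Lepcha.

\end{document}
```

Since this only restates what fontspec does when no script is requested, the shorter version without the option remains the simpler choice.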

polyglossia

The package polyglossia (see documentation by invoking texdoc polyglossia, or find it online) helps with a bunch of things that you may want to do when you switch languages in a document: change hyphenation patterns, change fonts and typographical conventions, change date and number formats, etc. To change fonts for a particular language or script, it assumes you have defined a font command named by the convention \<script>font or \<language>font.

Thus for example, instead of

\documentclass[12pt]{article}
\usepackage{fontspec}
\newfontfamily\hindifont[Script=Devanagari]{Noto Sans Devanagari}

\begin{document}

Hindi ahead: {\hindifont तुलसीदास श्रीरामचरितमानस} -- that was Hindi.

\end{document}

one can write

\documentclass[12pt]{article}
\usepackage{fontspec}
\newfontfamily\hindifont[Script=Devanagari]{Noto Sans Devanagari}

\usepackage{polyglossia}
\setdefaultlanguage{english}
\setotherlanguages{hindi}

\begin{document}

Hindi ahead: \texthindi{तुलसीदास श्रीरामचरितमानस} -- that was Hindi.

\end{document}

and get further benefits (in longer texts) beyond changing fonts, such as hyphenation patterns and so on. It turns out that polyglossia recognizes even fewer languages (and Lepcha isn't in the list):

[Image: list of languages known to polyglossia]

but that is not a big problem for you: just remove "lepcha" (and if you have an older version of polyglossia, "khmer") from your list

\setotherlanguages{bengali, hindi, malayalam, khmer, french, greek, russian, thai, armenian, tibetan, lepcha}
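As a rough sketch of the resulting header, shortened here to a single other language so that it compiles without the remaining script fonts: Lepcha is dropped from the polyglossia list and handled by fontspec alone. \textlepcha is a hypothetical helper defined only to mirror polyglossia's \text<language> commands; it does not come from polyglossia.

```latex
\documentclass[12pt]{article}
\usepackage{fontspec}
\setmainfont{Gentium Plus}

\usepackage{polyglossia}
\setdefaultlanguage{german}
% "lepcha" removed; keep whichever other languages the document needs.
\setotherlanguages{french}

% Lepcha is handled by fontspec alone.
\newfontfamily\lepchafont{Noto Sans Lepcha}
% Hypothetical helper mirroring polyglossia's \text<language> commands.
\newcommand{\textlepcha}[1]{{\lepchafont #1}}

\begin{document}

Lepcha: \textlepcha{ᰃᰨ ᰎᰥᰨᰜᰤᰦᰵ ᰀᰩᰰ ᰍᰩᰵᰡᰨ} -- und weiter auf Deutsch.

\end{document}
```

The helper only switches the font; unlike a real polyglossia language it changes neither hyphenation nor date formats.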

expex

Finally, there seem to be errors in your example with expex:

  • there is a stray } in (Plaisier 2007:82)}
  • there is no indication that the Lepcha text is in a different language/script, and needs switching to \lepchafont (but see "bonus" section below).

With those fixed, here is a minimal working example:

\documentclass[12pt]{article}
\usepackage{fontspec}
\newfontfamily\lepchafont{Noto Sans Lepcha}

\usepackage{expex}    
\definelingstyle{Lepcha}{}

\begin{document}

\ex[lingstyle=Lepcha]<lepcha>
\begingl
\glpreamble \rightcomment{[Lepcha]}  {\lepchafont ᰃᰨ ᰎᰥᰨᰜᰤᰦᰵ ᰀᰩᰰ ᰍᰩᰵᰡᰨ} //
\gla go proljaŋ-kɔn nɔŋ-ʃo//
\glb 1\textsc{sg} Bhutan-side go-\textsc{nprt}//
\glft `I am going in the direction of Bhutan.' (Plaisier 2007:82) //
\endgl
\xe

\end{document}

which produces

[Image: Lepcha example typeset with expex]

as in your question.


Bonus: ucharclasses

If you dislike having to explicitly set \lepchafont for the Lepcha text, you can use the package ucharclasses. As usual, you can see its documentation with texdoc ucharclasses or online: this package detects when some text in your input comes from a different Unicode block, and can execute a custom command for that text. You can use this to automate switching the font (or more, e.g. switch language for a language supported by polyglossia). So you can give your input as below, to get the same output as above:

\documentclass[12pt]{article}
\usepackage{fontspec}
\newfontfamily\lepchafont{Noto Sans Lepcha}

\usepackage[Lepcha]{ucharclasses}
\setTransitionTo{Lepcha}{\lepchafont}

\usepackage{expex}
\definelingstyle{Lepcha}{}

\begin{document}

\ex[lingstyle=Lepcha]<lepcha>
\begingl
\glpreamble \rightcomment{[Lepcha]}  ᰃᰨ ᰎᰥᰨᰜᰤᰦᰵ ᰀᰩᰰ ᰍᰩᰵᰡᰨ //
\gla go proljaŋ-kɔn nɔŋ-ʃo//
\glb 1\textsc{sg} Bhutan-side go-\textsc{nprt}//
\glft `I am going in the direction of Bhutan.' (Plaisier 2007:82) //
\endgl
\xe

\end{document}
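In running text outside any group, a transition defined only on entry leaves the Lepcha font selected after the Lepcha text ends. One possible variant, assuming the \setTransitionsFor command from the ucharclasses documentation (entry code, then exit code), wraps each Lepcha run in a group. This is an untested sketch, and such groups can misbehave if other grouping boundaries fall in the middle of a Lepcha run:

```latex
\documentclass[12pt]{article}
\usepackage{fontspec}
\newfontfamily\lepchafont{Noto Sans Lepcha}

\usepackage[Lepcha]{ucharclasses}
% Open a group and switch fonts on entering the Lepcha block,
% close the group (restoring the previous font) on leaving it.
\setTransitionsFor{Lepcha}{\begingroup\lepchafont}{\endgroup}

\begin{document}

Lepcha ahead: ᰃᰨ ᰎᰥᰨᰜᰤᰦᰵ ᰀᰩᰰ ᰍᰩᰵᰡᰨ -- that was Lepcha.

\end{document}
```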