[TeX/LaTeX] How can Bengali v.2 (bng2) be used to typeset Bengali with XeTeX?

fonts fontspec indic opentype xetex

This is a follow-up to this question about typesetting Bengali.

As explained in both answers there, XeLaTeX can typeset Bengali given a suitable font using polyglossia and fontspec. For example,

\documentclass[a4paper]{article}
\usepackage{geometry,fontspec,polyglossia}
\setmainlanguage[variant=british]{english}
\setotherlanguage{bengali}
\newfontfamily\bengalifont{Noto Sans Bengali}[Script=Bengali]
\begin{document}
x, y, z whatever\dots

\begin{bengali}
  আমি
\end{bengali}
\end{document}

produces

[image: output]

[Note that XeTeX is required. LuaTeX does not give correct results.]

As I explain in my answer, the font I used in this example actually offers two different versions of Bengali script, which correspond to two different OpenType scripts for Bengali:

beng            Bengali
bng2            Bengali v.2

As shown above, using the first one is straightforward.

What about the second? As Arun Debray explains in discussion following his answer, there are posts elsewhere suggesting this is possible, but the example which would have shown how to realise this possibility is no longer available at the sign-posted location. That is, the trail goes cold at this point.

Hence, Arun Debray and I thought this question worth asking:

How should the second be used?

Disclaimer: I know nothing whatsoever about Bengali. I am told that the sample above is Bengali and that the output is correct. However, had I not been told this, it could as easily be Psyptizamen and I would never know the difference.


Though this question is about Bengali, several other scripts (e.g. Devanagari, Tamil) have two OpenType versions, so whatever difference there is between these is not specific to just Bengali.

Best Answer

This has been answered in the comments on the question by Arun Debray (also at the other question) and by Will Robertson, fontspec developer; just turning it into an "answer" to get this question off the unanswered list.

Briefly, in the command

\newfontfamily\bengalifont{Noto Sans Bengali}[Script=Bengali]

the Script=Bengali is just a convenience, part of fontspec's pre-defined mapping of common names to OpenType script tags. As documented in section "Defining new scripts and languages" of the fontspec manual, you can define your own scripts with \newfontscript.

Thus, if you wish, you can forget about the default fontspec-defined Script=Bengali and define your own explicitly:

\documentclass[a4paper]{article}
\usepackage{geometry,fontspec,polyglossia}
\setmainlanguage[variant=british]{english}
\setotherlanguage{bengali}
\newfontscript{BengaliOpenTypeOld}{beng}
\newfontscript{BengaliOpenTypeNew}{bng2}
\newfontfamily\bengalifont{Noto Sans Bengali}[Script=BengaliOpenTypeNew]
\begin{document}
x, y, z whatever\dots

\begin{bengali}
  আমি
\end{bengali}
\end{document}

and switch between Script=BengaliOpenTypeNew and Script=BengaliOpenTypeOld as you wish.
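To check whether the two tags actually give different results with a particular font, one option is to load the same font twice, once per tag, and typeset the same word side by side. The following is only a minimal sketch: it assumes Noto Sans Bengali is installed and provides both tags, and the family names \bengold and \bengnew are made up for this illustration.

```latex
% Compile with xelatex.
\documentclass[a4paper]{article}
\usepackage{fontspec}
% User-defined script names mapped to the two OpenType tags
\newfontscript{BengaliOpenTypeOld}{beng}
\newfontscript{BengaliOpenTypeNew}{bng2}
% Same font loaded twice, once per script tag (hypothetical family names)
\newfontfamily\bengold{Noto Sans Bengali}[Script=BengaliOpenTypeOld]
\newfontfamily\bengnew{Noto Sans Bengali}[Script=BengaliOpenTypeNew]
\begin{document}
beng: {\bengold আমি}\quad bng2: {\bengnew আমি}
\end{document}
```

If the font does not provide one of the tags, fontspec is expected to warn about it in the log rather than fail silently. The same pattern should carry over to other scripts that have a second tag, such as dev2 for Devanagari or tml2 for Tamil, although that is untested here.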


Aside: The rest of this answer is tangential, but related to the motivation for asking this question (see the other question). The reason XeTeX is required and LuaTeX does not (currently) give correct results is that XeTeX uses system libraries such as HarfBuzz for complex text layout, also known as text shaping (glyph reordering, glyph positioning, etc.). LuaTeX instead aims to minimize external dependencies and implement everything in Lua code. This (IMO highly ambitious) work has, at the moment, simply not been done for Indic scripts, apart from reasonable support for Devanagari and some basic support for Malayalam. (See font-odv.lua in the ConTeXt source code.)

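For example, the word "আমি" consists of three Unicode "characters" (codepoints) in this order:

U+0986          BENGALI LETTER AA (আ)
U+09AE          BENGALI LETTER MA (ম)
U+09BF          BENGALI VOWEL SIGN I (ি)

The glyph for the vowel sign needs to be placed to the left of the consonant, even though it follows the consonant in the text. This reordering is done by HarfBuzz (or, on Windows, possibly DirectWrite):

[image: output with HarfBuzz]

In LuaTeX, by contrast, the glyphs are picked from the font and simply placed one after another, meaninglessly:

[image: output with LuaLaTeX]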
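To make the logical order of the codepoints explicit in the source, the word can also be entered with TeX's ^^^^ notation instead of literal characters. This is only a sketch, assuming XeLaTeX and the same Noto Sans Bengali font as above; the hex digits in this notation must be lowercase.

```latex
% Compile with xelatex.
\documentclass[a4paper]{article}
\usepackage{fontspec}
\newfontfamily\bengalifont{Noto Sans Bengali}[Script=Bengali]
\begin{document}
% Literal input
{\bengalifont আমি}

% Same word by codepoint, in logical order:
% U+0986 BENGALI LETTER AA, U+09AE BENGALI LETTER MA, U+09BF BENGALI VOWEL SIGN I
{\bengalifont ^^^^0986^^^^09ae^^^^09bf}
\end{document}
```

Both lines should come out identical, with the vowel sign drawn to the left of the consonant, because the shaping engine does the reordering regardless of how the characters were entered.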
