[Tex/LaTex] Can one instruct LuaLaTeX to use T2A encoded fonts

cyrillic fonts luatex

As stated in the answer to "Can XeTeX | LuaTeX use MetaFont fonts?", LuaLaTeX can use MetaFont fonts. If so, how can one instruct LuaLaTeX to use T2A-encoded fonts (e.g. cm-super) to typeset Cyrillic text? I used to think that LuaLaTeX can use only fonts that are internally encoded according to Unicode. Am I right?

UPDATE:
The following code (slightly modified from @Leo Liu's answer) prints "Б-Б-Б" when compiled with pdflatex but "Б–Б" with lualatex:

\documentclass{minimal}
\usepackage[T2A]{fontenc} % cyrillic font encoding
\usepackage[utf8]{inputenc} % source file encoding
\begin{document}
\symbol{"C1}-Б-\CYRB
\end{document}

I wonder why \usepackage[utf8]{inputenc} does not work with lualatex. I also tried luainputenc, with the same result. Note that the \CYRB command is defined by fontenc, which loads t2aenc.def. As for inputenc, it should map Unicode characters in the source file to the corresponding \cyr... macros. It does so for pdflatex but not for lualatex.

Best Answer

You can use it directly. fontenc works well with XeTeX/LuaTeX. For example,

\documentclass{article}
\usepackage[T2A]{fontenc}

\begin{document}

\symbol{"C1} % produce Б

\end{document}

However, you can't use it together with fontspec. An error will be produced:

! Corrupted NFSS tables.

Moreover, entering Cyrillic text is a problem: inputenc cannot be used in XeTeX and LuaTeX directly.
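One engine-independent way around the input problem is to spell the text with the macros that t2aenc.def declares (such as \CYRB from the question) instead of typing raw Cyrillic characters. A minimal sketch, assuming these macro names are acceptable to type by hand:

```latex
\documentclass{article}
\usepackage[T2A]{fontenc}

\begin{document}

% "Русский язык" spelled with the \CYR.../\cyr... macros from t2aenc.def.
% No input-encoding package is involved, so it does not matter how the
% engine reads the bytes of the source file.
\CYRR\cyru\cyrs\cyrs\cyrk\cyri\cyrishrt{} \cyrya\cyrz\cyrery\cyrk

\end{document}
```

This is only practical for short fragments, but it shows that the font side works and that only the input mapping is missing.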

Anyway, if you want to typeset Cyrillic text in XeLaTeX and LuaLaTeX, just use the CM Unicode fonts. For example,

\documentclass{article}
\usepackage{fontspec}
\setmainfont{CMU Serif}
\setsansfont{CMU Sans Serif}
\setmonofont{CMU Typewriter Text}

\begin{document}

Русский язык

\textsf{Русский язык}

\texttt{Русский язык}

\end{document}

Additional notes:

inputenc is meant for text in 8-bit encodings. Knuth's traditional TeX could use ASCII (0x00--0x7F) only. Later TeX implementations can process 8-bit extended ASCII (0x00--0xFF).

Most Western encodings are some kind of extended ASCII, e.g. Latin 1 (ISO 8859-1), KOI8-R, CP866. The ASCII part is identical, but the higher codes (0x80--0xFF) differ.

Fonts have their own encodings. Basically, a font encoding corresponds to an input encoding. For example, the T1 encoding is very similar to Latin 1, and T2A is very similar to Windows Code Page 1251. That is to say, if you TeX a source file saved in CP1251 and use \usepackage[T2A]{fontenc}, you will get proper Russian text without any other packages.
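The correspondence can be checked by addressing the font slots by number. The following sketch assumes the standard CP1251 codes for "Русский" (0xD0 0xF3 0xF1 0xF1 0xEA 0xE8 0xE9) and asks for the glyphs at the same positions in the T2A font, which should spell the same word:

```latex
\documentclass{article}
\usepackage[T2A]{fontenc}

\begin{document}

% CP1251 byte values of "Русский", used directly as T2A slot numbers
\symbol{"D0}\symbol{"F3}\symbol{"F1}\symbol{"F1}\symbol{"EA}\symbol{"E8}\symbol{"E9}

\end{document}
```

Letters outside the basic А--я range need not sit at the same positions in both tables, which is why the two encodings are similar rather than identical.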

inputenc lets us input text in different encodings while using fonts in only one font encoding. When you choose the utf8x option and use the T2A font encoding, the characters are mapped from the UTF-8 input to T2A (CP1251). That's what inputenc does.

How does inputenc work? It makes the characters active (catcode 13) and does the mapping. For multibyte characters, things may be more complex.
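The mechanism can be imitated by hand for a single 8-bit character. This is a simplified sketch, not the actual inputenc code: it assumes the byte 0xC1 (Б in CP1251) and writes it in ^^ notation so that the example itself stays ASCII-only.

```latex
\documentclass{article}
\usepackage[T2A]{fontenc}

% Make character code 0xC1 active (catcode 13) ...
\catcode"C1=\active
% ... and define the active character to expand to the encoding-independent
% macro \CYRB, which fontenc maps to the right slot of the current font.
\def^^c1{\CYRB}

\begin{document}

^^c1 % typeset as the Cyrillic letter Б

\end{document}
```

For UTF-8 input under an 8-bit engine, the active lead byte additionally has to read the following continuation byte(s) before the lookup can happen, which is the more complex part.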

Why can't inputenc be used in XeTeX and LuaTeX? XeTeX and LuaTeX use UTF-8 encoding for input. A multibyte Unicode character is a single character in XeTeX/LuaTeX, but in pdfTeX it is a sequence of two or three bytes. Thus the different TeX engines see the input stream differently, and inputenc fails in XeTeX and LuaTeX.

I'm not very familiar with LuaTeX. In XeTeX, there is a special mode for compatibility. You can use \XeTeXinputencoding to change the input encoding temporarily, and the "bytes" encoding makes XeTeX work like an 8-bit TeX engine. Thus, this works for XeTeX:

\XeTeXinputencoding "bytes"
\documentclass{article}

\usepackage[T2A]{fontenc}
\usepackage[utf8x]{inputenc}

\begin{document}

Русский язык

\textsf{Русский язык}

\texttt{Русский язык}

\end{document}

(This technique is used by the obsolete xCJK package.)

However, inputenc may make real Unicode fonts loaded with fontspec behave unexpectedly. It is still not a good idea to use it.
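One way to keep the two mechanisms apart is to choose the setup per engine, so that inputenc is loaded only under pdfTeX and fontspec only under the Unicode engines. The sketch below assumes the iftex package is installed (it is not part of the examples above) and reuses the CMU Serif font from the earlier example:

```latex
\documentclass{article}
\usepackage{iftex}

\ifPDFTeX
  % 8-bit engine: legacy font encoding plus byte-level input mapping
  \usepackage[T2A]{fontenc}
  \usepackage[utf8]{inputenc}
\else
  % Any other engine, assumed here to be XeTeX or LuaTeX:
  % Unicode fonts through fontspec, no inputenc
  \usepackage{fontspec}
  \setmainfont{CMU Serif}
\fi

\begin{document}

Русский язык

\end{document}
```

The same source then compiles with pdflatex, xelatex and lualatex without inputenc and fontspec ever being loaded together.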


By Khaled Hosny

luainputenc can be used here. I didn't know that.

\documentclass{article}

\usepackage[T2A]{fontenc}
\usepackage[lutf8x]{luainputenc}


\begin{document}

Русский язык

\textsf{Русский язык}

\texttt{Русский язык}

\end{document}