[Tex/LaTex] Why use [T2A, T1] when [T2A] is enough

best practices, cyrillic, font-encodings, fonts, macros

I am writing a seminar paper in Cyrillic for university:

\documentclass[a4paper,14pt]{extarticle}
\usepackage[T2A,T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{droid}

\DeclareRobustCommand{\cyr}[1]{%
  {\fontencoding{T2A}\selectfont#1}%
}

\renewcommand{\contentsname}{\cyr{Садржај}}
\renewcommand{\refname}{\cyr{Литература}}


\begin{document}

\begin{titlepage}
\vspace*{\stretch{2}}
\begin{center}
\huge
\cyr{Назив семинарског} \\

\vspace{\stretch{4}}
%\normalsize
\cyr{Име и презиме}
\end{center}
\end{titlepage}

\tableofcontents
\clearpage

\section{\cyr{Глава 1}}

\cyr{ЋЧСАЋЧСАЋ АСААС ШЂСАШЂ ЊЉЖ СААЋЖ СЦАБЕК АСАЖЋ сажћ ћафа ћжасжћашђ асдажћ}



\section{\cyr{Глава 2}}

text $math$

English alphabet W Q 

$\int\limits_0^1 \frac{x^2-3x}{\sqrt{x-1}}\,dx$


\section{\cyr{ПОГЛАВЉЕ}}

\end{document}

But when I use

\documentclass{article}
\usepackage[T2A]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{droid}

I don't see any difference, even though the T1 encoding is not loaded here. I can still type the English alphabet with an English keyboard layout, and I enter Cyrillic letters directly. I did not define \cyr in this version because I was trying out different approaches.

So my question is: why should I use the first approach instead of the second? I suppose there is a difference between methods 1 and 2, but what is it?

Best Answer

The character maps of T2A and T1 are identical in the first 128 positions, so English text should work fine with T2A. But if you want to insert umlauts and other accented characters from the upper half of the T1 encoding table, you should consider switching to T1 for such text instead of relying on LaTeX to fall back to the correct definitions.
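As an illustration of that advice, here is a minimal sketch, assuming pdfLaTeX and a UTF-8 source file. The helper \lat is a hypothetical name (it is not part of any package); it simply mirrors the \cyr macro from the question, but switches to T1 instead of T2A.

```latex
\documentclass{article}
\usepackage[T1,T2A]{fontenc} % the last option (T2A) becomes the default encoding
\usepackage[utf8]{inputenc}

% Hypothetical helper: typeset the argument in the T1 encoding
\DeclareRobustCommand{\lat}[1]{%
  {\fontencoding{T1}\selectfont #1}%
}

\begin{document}
Ћирилични текст and plain English both use the T2A default.

% Accented Latin letters are taken from the precomposed T1 slots here
\lat{Müller, café, naïve}
\end{document}
```

With only T2A loaded, the accented words would still be typeset, but LaTeX would have to build them some other way than from precomposed T1 glyphs.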

Additional notes:

inputenc is meant for text in 8-bit encodings. Knuth's original TeX could handle only 7-bit ASCII (0x00--0x7F); later TeX implementations can process 8-bit extended ASCII (0x00--0xFF).

Most 8-bit encodings are some form of extended ASCII, e.g. Latin 1 (ISO 8859-1), KOI8-R, CP866. The ASCII part is identical, but the higher codes (0x80--0xFF) differ.

Fonts have their own encodings. Basically, a font encoding corresponds to an input encoding. For example, the T1 encoding is very similar to Latin 1, and T2A is very similar to Windows code page 1251. That is to say, if you compile a source file saved in CP1251 and use \usepackage[T2A]{fontenc}, you will get proper Russian text without any other packages.

inputenc lets us type text in different input encodings while using fonts in a single font encoding. When you choose the utf8x option and the T2A font encoding, the characters of the UTF-8 input are mapped to T2A (CP1251) slots. That is what inputenc does.
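The mapping can be made visible with a small sketch, assuming pdfLaTeX, a UTF-8 source file, and the standard utf8 option rather than utf8x. It relies on \CYRYA being the T2A name for the letter Я and on that letter sitting at slot 0xDF, the same position it has in CP1251.

```latex
\documentclass{article}
\usepackage[T2A]{fontenc}
\usepackage[utf8]{inputenc}

\begin{document}
% 1. UTF-8 input: inputenc turns the two bytes of the letter into \CYRYA
Я

% 2. The encoding-specific command that inputenc maps to
\CYRYA

% 3. The raw T2A font slot (0xDF, as in CP1251)
\symbol{"DF}
\end{document}
```

All three lines should print the same letter, which is the sense in which the input encoding and the font encoding line up.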

How does inputenc work? It makes the characters active (catcode 13) and does the mapping. For multibyte characters, things may be more complex.
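One way to see this mechanism from the user side is \DeclareUnicodeCharacter, which adds or overrides a single mapping. The following is only a sketch, assuming pdfLaTeX with the standard utf8 option (utf8x comes from the ucs package and is configured differently); the choice of U+2192 is just an example.

```latex
\documentclass{article}
\usepackage[T2A]{fontenc}
\usepackage[utf8]{inputenc}

% Map the Unicode code point U+2192 (rightwards arrow) to a LaTeX definition.
% In pdfTeX the character arrives as three bytes; the first byte is active
% and collects the remaining ones before expanding to this definition.
\DeclareUnicodeCharacter{2192}{\ensuremath{\rightarrow}}

\begin{document}
Вход → выход
\end{document}
```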

Why can't inputenc be used in XeTeX and LuaTeX? XeTeX and LuaTeX read their input as UTF-8. A multibyte Unicode character is a single character in XeTeX/LuaTeX, but in pdfTeX it is a sequence of two or more bytes. The engines therefore see the input stream differently, and inputenc fails in XeTeX and LuaTeX.

I'm not very familiar with LuaTeX. XeTeX has a special compatibility mode: you can use \XeTeXinputencoding to change the input encoding temporarily, and the "bytes" encoding makes XeTeX work like an 8-bit TeX engine. Thus, this works in XeTeX:

\XeTeXinputencoding "bytes"
\documentclass{article}

\usepackage[T2A]{fontenc}
\usepackage[utf8x]{inputenc}

\begin{document}

Русский язык

\textsf{Русский язык}

\texttt{Русский язык}

\end{document}

(This technique is used by the obsolete xCJK package.)

However, inputenc may cause real Unicode fonts loaded with fontspec to behave unexpectedly, so using it in these engines is still not a good idea.
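For comparison, the native approach in XeTeX and LuaTeX avoids inputenc and fontenc altogether. This is a minimal sketch, to be compiled with xelatex or lualatex; the font name CMU Serif is an assumption, and any installed font with Cyrillic glyphs could be substituted.

```latex
% Compile with xelatex or lualatex, not pdflatex
\documentclass{article}
\usepackage{fontspec}

% Assumption: the CMU fonts (cm-unicode) are installed;
% replace the names with any fonts that contain Cyrillic glyphs
\setmainfont{CMU Serif}
\setsansfont{CMU Sans Serif}
\setmonofont{CMU Typewriter Text}

\begin{document}
Русский язык

\textsf{Русский язык}

\texttt{Русский язык}
\end{document}
```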

As Khaled Hosny pointed out, luainputenc can be used here with LuaTeX. I didn't know that.

\documentclass{article}

\usepackage[T2A]{fontenc}
\usepackage[lutf8x]{luainputenc}


\begin{document}

Русский язык

\textsf{Русский язык}

\texttt{Русский язык}

\end{document}

[Tex/LaTex] TeXLive/PDFTeX fonts loading problem

The font larm1000 is the base font for the Cyrillic encoding T2A, which you have made the default for your document because it is the last option in

\usepackage[T1,T2A]{fontenc}

This should be corrected, if the default language is English, as specified by

\usepackage[hebrew,english]{babel}

To avoid the problem, use tlmgr to install all the packages in collection-langcyrillic, in particular the lh package.
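A sketch of the corrected option order, assuming pdfLaTeX and a UTF-8 source file. The babel line is left out to keep the example small, and the T2A part still needs the Cyrillic fonts from the lh package to be installed.

```latex
\documentclass{article}
\usepackage[T2A,T1]{fontenc} % T1 is listed last, so it is the default encoding
\usepackage[utf8]{inputenc}

\begin{document}
English text is typeset in T1 by default.

% Explicit, local switch to T2A for Cyrillic text
{\fontencoding{T2A}\selectfont Русский текст}
\end{document}
```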
