[Tex/LaTex] verbatim and CJK

cjk fonts typewriter verbatim

I'm not familiar with Chinese characters and fonts, but I have to typeset some of them in a document. XeLaTeX may be an option, but I'm looking for a solution with pdflatex. Here is my MWE, which I created with the help of other questions and answers:

\documentclass{article}
\usepackage{CJKutf8}
\begin{document}
\begin{CJK*}{UTF8}{gbsn}
中華人民共和國

ABC-1234
\begin{verbatim}
中華人民共和國
ABC-1234
\end{verbatim}
\end{CJK*}
\end{document}

I have two questions:

1. Is there a monospace font for Chinese characters that can be used in verbatim environments at the same size as Latin characters (I'm using MacTeX/TeX Live 2012)?
2. Why do not all seven Chinese characters appear in the resulting PDF?

result of pdflatex

Best Answer

output of MWE

The ASCII characters ABC... are usually typeset as halfwidth characters while Chinese characters (汉字，　ＡＢＣ．．．) are typeset as fullwidth characters. See also Halfwidth and fullwidth forms.

Hence, one solution is to convert the ASCII characters to their fullwidth Unicode counterparts. You might write your own converter or use a website like http://kiserai.net/hwfw.pl

There are two writing systems for Chinese characters: traditional and simplified. The characters 華 and 國 are traditional characters, while the other five can be used in both systems. Unfortunately, there is no one-to-one mapping between traditional and simplified characters: several traditional characters can map to the same simplified one.

Hence, when using a CJK environment you also need to decide which writing system, i.e., which font, to use: gbsn and gkai are fonts with simplified characters, while bsmi and bkai are fonts with traditional characters. The MWE in the question selects gbsn, a simplified font, which is why 華 and 國 are missing from its output. See also "Problems of traditional and simplified Chinese characters" and "CJK environment and Chinese".

Here is the code of the above picture:

\documentclass{article}
\usepackage{CJKutf8}
\begin{document}
\begin{CJK}{UTF8}{bsmi}
中華人民共和國

ABC-1234
\begin{verbatim}
中華人民共和國
ＡＢＣ－１２３４
\end{verbatim}
\end{CJK}
\end{document}
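
As a minimal sketch of the font choice (assuming the fonts gbsn and bsmi are installed, as they are in the setup above), the same phrase can be set once in simplified and once in traditional characters. \CJKfamily switches the font inside a running CJK environment:

```latex
\documentclass{article}
\usepackage{CJKutf8}
\begin{document}
\begin{CJK*}{UTF8}{gbsn}
% Simplified characters, set with the simplified font gbsn
中华人民共和国

% Switch to the traditional font bsmi for the traditional spelling
\CJKfamily{bsmi}
中華人民共和國
\end{CJK*}
\end{document}
```

Each line uses only characters that belong to the selected font's writing system, so no character should be dropped.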

Related Solutions

[Tex/LaTex] Han characters replace Hiragana when using Meiryo font with CJK

I don't suppose you could consider using XeTeX or LuaTeX instead to compile your document? For example, here's a modified version of your minimal example (which for some reason seems to have lots of errors in it):

\documentclass[a4paper]{article}
\usepackage{fontspec}
\setmainfont[Ligatures=TeX]{Hiragino Mincho Pro}
\usepackage{ruby}
\renewcommand{\rubysize}{0.7}
\renewcommand{\rubysep}{-0.8pt}
\newcommand{\rdf}[3]{\ruby{#1}{#2} -- #3}
\begin{document}
    \begin{tabular}{l}
      \rdf{映画}{えいが}{Movie}\\
      \rdf{上映}{じょうえい}{Show a movie}\\
      \rdf{放映}{ほうえい}{Televising}
    \end{tabular}
\end{document}

[Tex/LaTex] Using CJK (Chinese/kanji) characters in math using unicode-math on lualatex

The characters to be used in math mode come from CJK languages. In general these characters can be treated as ordinary symbols. According to the math classification (see my explanation below) there are two such classes: 0 and 7. Typesetting CJK languages differs from typesetting languages with alphabets; e.g., CJK languages traditionally do not use italics for emphasis (but may have other means of doing so). If italic, bold, and similar shapes do not exist for such a font, so that \mathit, \mathbf, ... cannot be used, it seems appropriate to choose class 0 instead of class 7. In fact, by default a Unicode character "zzzzzz ("0 to "10FFFF) is assigned the Umathcode "0"0"zzzzzz. Hence the character is already treated as an ordinary symbol of font family 0, and no change is necessary.

But it seems that \setmathfont (unicode-math, version 0.7c) does not work properly here. As a workaround we define the command \adjustmathfont, which uses a counter my@char to step through the range from the first index #1 to the last index #2. At each step it assigns the character to the font family given by the third argument #3 via \Umathcode\value{my@char} = "0 #3 \value{my@char}. For example, if #1 and #2 are both "7121 and #3 is "4, this just produces \Umathcode"7121="0"4"7121. The full code in an MWE follows.

\documentclass{article}
\usepackage{fontspec}
\usepackage{unicode-math}
\setmainfont{Linux Libertine O}
\newfontfamily\cjkfont{Kochi Mincho}

%------ workaround ------
\usepackage{etoolbox}
\makeatletter

%usage: \adjustmathfont{arg1}{arg2}{arg3}
%   where  arg1 is the beginning of the unicode range, e.g. "4E00
%          arg2 is the end of the unicode range, e.g. "9FFF
%          arg3 is the font number, e.g. "4
\newcounter{my@char}
\newcommand{\adjustmathfont}[3]{%
  \ifnumgreater{#1}{#2}{%
    \PackageWarning{}{No adjustment of math font since #1 is greater than #2.}
  }{
    \setcounter{my@char}{#1}
    \Umathcode\value{my@char}="0 #3 \value{my@char}
    \whileboolexpr{%
      test {\ifnumless{\value{my@char}}{#2}} 
    }{%
      \stepcounter{my@char}
      \Umathcode\value{my@char}="0 #3 \value{my@char}
    }
  }
}
\makeatother
%------------------------

\setmathfont{XITS Math}
\setmathfont[range={"4E00-"9FFF}]{Kochi Mincho}
%the new math font (here "Kochi Mincho") might use font number 4 or higher;
%please see @Gro-Tsen's comment how to automate this;
\adjustmathfont{"4E00}{"9FFF}{"4}

\begin{document}
Hello, world!  Здравствуй, мир!  Unicode est vraiment \emph{épatant}!  \cjkfont{漢字}

$\mathbf{Δ} = (Δ_ι)_{ι∈I}$  $無_無^無 = ∅$
\end{document}

BTW, the usage of \cjkfont could be avoided by using an approach as shown in this blog. For example, the package fontspec can be replaced by ctex and \setCJKmainfont{Kochi Mincho} needs to be added. Then \cjkfont is not needed.
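
A minimal sketch of that alternative, assuming LuaLaTeX or XeLaTeX and an installed Kochi Mincho font. The option fontset=none is my own addition, so that ctex does not try to load its default Chinese fonts; the math setup from the MWE above is left out for brevity:

```latex
\documentclass{article}
% Assumption: fontset=none keeps ctex from loading its default font set
\usepackage[fontset=none]{ctex}
% CJK text is now set in this font automatically, without \cjkfont
\setCJKmainfont{Kochi Mincho}
\begin{document}
Hello, world! 漢字
\end{document}
```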

Some details about math mode

Math mode has different rules from "normal" text typesetting. In math mode each character is assigned a "mathcode" (hexadecimal "xyzz), which tells how to print that character. The mathcode consists of three parts: the "math class" x, the font family y, the position zz of the character in that font family.

The class x controls several aspects of how a character is typeset, especially the spacing, and can take the following eight values: 0: ordinary symbol, 1: large operator, 2: binary operator, 3: relation, 4: opening symbol, 5: closing symbol, 6: punctuation, 7: variable family (= ordinary symbol, except that \fam is chosen instead of y if \fam is in the range 0-15). The font family y is in the range 0-15. The position zz is in the range 0-255.

For example, the mathcode of the comma is set by \mathcode`\,="613B, which means that the comma is treated as punctuation (class 6) and typeset using the character at position "3B of font family 1. More examples can be found in the file "tex/plain/base/plain.tex".
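
As a small illustration of how the class affects spacing, the following sketch (for a plain article document without unicode-math) changes only the class digit of the comma from 6 (punctuation) to 0 (ordinary), keeping family 1 and position "3B. This is the common decimal-comma trick, shown here purely as an example and not as part of the workaround above:

```latex
\documentclass{article}
\begin{document}
% Default: the comma is punctuation (class 6), so a thin space follows it
$1,5$

% Same family (1) and position ("3B), but class 0: an ordinary symbol
\mathcode`\,="013B
$1,5$
\end{document}
```

Comparing the two formulas in the output shows the effect of the class digit: only the spacing after the comma differs, while the glyph stays the same.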

Computers are far less restricted today than they were some decades ago. In XeTeX and LuaTeX, the engines on which the package unicode-math relies, the ranges of the mathcode are therefore extended to suit Unicode fonts: the font family to yy (8 bits) and the character position to zzzzzz (ranging from "0 to "10FFFF, about 21 bits). The extended fields can be set by \Umathcode"zzzzzz="x"yy"zzzzzz; for example, \Umathcode"2190="3"0"2190 makes the left arrow ← (U+2190) a relation taken from font family 0. (For details, see the luatexref documentation mentioned here.)
