The characters to be used in math mode are from CJK languages. In general these characters can be considered as ordinary symbols. According to the math classification -- see also my explanation below! -- there are two such classes: 0 and 7. Typesetting of CJK languages is different from typesetting languages with alphabets. E.g., traditionally CJK languages do not use italics for emphasis (but may have other means to do so). If italics, bold shape ... do not existing for such a font and \mathit
, \mathbf
, ... cannot be used then it seems appropriate to choose class 0 instead of class 7. Actually by default, a unicode character "zzzzzz
("0
- "10FFFF
) is assigned Umathcode "0"0"zzzzzz
. Hence, the character is already considered as an ordinary symbol of font family 0 and no change is necessary.
But it seems that \setmathfont
(unicode-math, version 0.7c) is not working properly. As a workaround we define the command \adjustmathfont
that uses a counter my@char
to steps through the range from the first index #1
to the last index #2
. At each step we adjust the font family by \Umathcode\value{my@char} = "0 #3 \value{my@char}
to the font family given by the third argument #3
. For example, if #1
and #2
are equal to "7121
and #3
is equal to "4
this just produces \Umathcode"7121="0"4"7121
. The full code in a MWE follows.
\documentclass{article}
\usepackage{fontspec}
\usepackage{unicode-math}
\setmainfont{Linux Libertine O}
\newfontfamily\cjkfont{Kochi Mincho}
%------ workaround ------
\usepackage{etoolbox}
\makeatletter
%usage: \adjustmathfont{arg1}{arg2}{arg3}
% where arg1 is the beginning of the unicode range, e.g. "4E00
% arg2 is the end of the unicode range, e.g. "9FFF
% arg3 is the font number, e.g. "4
\newcounter{my@char}
\newcommand{\adjustmathfont}[3]{%
\ifnumgreater{#1}{#2}{%
\PackageWarning{}{No adjustment of math font since #1 is greater than #2.}
}{
\setcounter{my@char}{#1}
\Umathcode\value{my@char}="0 #3 \value{my@char}
\whileboolexpr{%
test {\ifnumless{\value{my@char}}{#2}}
}{%
\stepcounter{my@char}
\Umathcode\value{my@char}="0 #3 \value{my@char}
}
}
}
\makeatother
%------------------------
\setmathfont{XITS Math}
\setmathfont[range={"4E00-"9FFF}]{Kochi Mincho}
%the new math font (here "Kochi Mincho") might use font number 4 or higher;
%please see @Gro-Tsen's comment how to automate this;
\adjustmathfont{"4E00}{"9FFF}{"4}
\begin{document}
Hello, world! Здравствуй, мир! Unicode est vraiment \emph{épatant}! \cjkfont{漢字}
$\mathbf{Δ} = (Δ_ι)_{ι∈I}$ $無_無^無 = ∅$
\end{document}
BTW, the usage of \cjkfont
could be avoided by using an approach as shown in this blog. For example, the package fontspec
can be replaced by ctex
and \setCJKmainfont{Kochi Mincho}
needs to be added. Then \cjkfont
is not needed.
Some details about math mode
Math mode has different rules from "normal" text typesetting. In math mode each character is assigned a "mathcode" (hexadecimal "xyzz
), which tells how to print that character. The mathcode consists of three parts: the "math class" x
, the font family y
, the position zz
of the character in that font family.
The class x
controls several aspects of typesetting of a character, especially the spacing, and can take following eight values: 0: ordinary symbol, 1: large operator, 2: binary operator, 3: relation, 4: opening symbol, 5: closing symbol, 6: punctuation, 7: variable family (= oridnary symbol except that \fam is choosen instead of y
if \fam in the range 0-15). The font family y
is from the range 0-15. The position zz
is from the range 0-255.
For example, the mathcode of the symbol \,
is set by \mathcode`\,="613B
which means that \,
is considered as punctuation and typeset by using the symbol "3B of font family 1. More examples can be found in the file "tex/plain/base/plain.tex".
Nowadays computers are much less restricted than some decades ago. Thus, by using the package unicode-math
the ranges of the mathcode are extended: for the font family to yy
(8 bits) and for the charater positions to zzzzzz
(ranging "0
to "10FFFF
, about 21 bits) to suit Unicode fonts. The extended fields can be set by \Umathcode"zzzzzz="x"yy"zzzzzz
, for example, \Umathcode\leftarrow="3"0"02190
. (For details, see the luatexref documentation mentioned here.)
Best Answer
The ASCII characters
ABC...
are usually typeset as halfwidth characters while Chinese characters (汉字, ABC...
) are typeset as fullwidth characters. See also Halfwidth and fullwidth forms.Hence, a solution is to convert the ASCII characters to fullwidth ones in Unicode. You might write your own converter or use a website like http://kiserai.net/hwfw.pl
There are two writing systems for Chinese characters: the traditional and the simplified system. The characters
華
and國
belong to the traditional characters, while the others can be used in both systems. Unfortunately, there is no one-to-one mapping between traditional characters and simplified ones. Several traditional characters can be mapped to one simplified one.Hence, when using a
CJK
environment you also need to decide which writing system, i.e., which font to use:gbsn
andgkai
are fonts with simplified characters, whilebsmi
andbkai
are fonts with traditional characters. See also Problems of traditional and simplified Chinese characters andCJK
environment and ChineseHere is the code of the above picture: