Confusion about unicode-math symbols w.r.t. math/text mode and format

luatex, math-mode, unicode, unicode-math, xetex

Many math symbols defined by unicode-math either "work" in text mode or at least don't produce errors. For example, taking a representative sample from unicode-math-table.tex, the following code compiles with lualatex:

\documentclass{article}
\usepackage{unicode-math,array,shortvrb}
\newcolumntype{M}{>{$}c<{$}}
\MakeShortVerb{\|}

\begin{document}

\begin{tabular}{ r | M | c | l}
Command & \text{Math mode} & Text mode & Symbol class \\
\hline
|\twolowline| & \twolowline & \twolowline & |\mathord| \\
|\dagger| & \dagger & \dagger & |\mathbin| \\
|\ddagger| & \ddagger & \ddagger & |\mathbin| \\
|\smblkcircle| & \smblkcircle & \smblkcircle & |\mathbin| \\
|\enleadertwodots| & \enleadertwodots & \enleadertwodots & |\mathord| \\
|\unicodeellipsis| & \unicodeellipsis & \unicodeellipsis & |\mathord| \\
|\hyphenbullet| & \hyphenbullet & \hyphenbullet & |\mathord| \\
|\fracslash| & \fracslash & \fracslash & |\mathbin| \\
|\Question| & \Question & \Question & |\mathord| \\
|\closure| & \closure & \closure & |\mathrel| \\
|\qprime| & \qprime & \qprime & |\mathord| \\
|\euro| & \euro & \euro & |\mathord| \\
|\enclosecircle| & \enclosecircle & \enclosecircle & |\mathord| \\
|\enclosesquare| & \enclosesquare & \enclosesquare & |\mathord| \\
|\BbbZ| & \BbbZ & \BbbZ & |\mathalpha| \\
|\mho| & \mho & \mho & |\mathord| \\
|\mfrakZ| & \mfrakZ & \mfrakZ & |\mathalpha| \\
|\turnediota| & \turnediota & \turnediota & |\mathalpha| \\
|\Angstrom| & \Angstrom & \Angstrom & |\mathalpha| \\
|\mscrB| & \mscrB & \mscrB & |\mathalpha| \\
|\mfrakC| & \mfrakC & \mfrakC & |\mathalpha| \\
|\mscre| & \mscre & \mscre & |\mathalpha| \\
|\leftarrow| & \leftarrow & \leftarrow & |\mathrel| \\
|\uparrow| & \uparrow & \uparrow & |\mathrel| \\
|\rightarrow| & \rightarrow & \rightarrow & |\mathrel| \\
|\downarrow| & \downarrow & \downarrow & |\mathrel| \\
|\leftrightarrow| & \leftrightarrow & \leftrightarrow & |\mathrel| \\
|\updownarrow| & \updownarrow & \updownarrow & |\mathrel| \\
|\nwarrow| & \nwarrow & \nwarrow & |\mathrel| \\
|\nearrow| & \nearrow & \nearrow & |\mathrel| \\
\end{tabular}
\end{document}

With lualatex at least, every symbol defined with \UnicodeMathSymbol, except those in the classes \mathopen, \mathclose, \mathaccent(wide), \mathaccentoverlay, \mathbotaccent(wide), \mathover, and \mathunder, compiles in text mode without error. Whether a symbol is actually produced depends on the font and, as can be seen above, the symbol in text mode often looks slightly different from the one in math mode. That is expected, since (I think) they come from different Unicode blocks.

However, if you compile with xelatex, you get the error "! Missing $ inserted." for \qprime. The document compiles if you remove that line, but in general fewer symbols work in text mode with xelatex.

Last observation: Without unicode-math, e.g.

\documentclass{article}
\begin{document}
\dagger
\end{document}

compiles without error under lualatex but produces no symbol. With xelatex, it throws an error. Adding \usepackage{unicode-math} makes it compile and produce a symbol with both engines.

My questions are:

  1. Why does lualatex not produce errors when math symbols are used outside of math mode? More generally, why the different behavior between lualatex and xelatex here?
  2. Is the choice to make symbols work in text mode intentional on the part of unicode-math?

My guess is that \UnicodeMathSymbol makes \dagger expand to the literal Unicode character, which lualatex or xelatex then interprets according to the current context (math or text). As long as no extra code requires math mode (as there is for \mathaccent), the "Math" part of \UnicodeMathSymbol is not really important. But this is only a guess, and it does not explain why \qprime produces an error outside math mode with xelatex.

Best Answer

\documentclass{article}
\begin{document}
\show\dagger\dagger
\end{document}

shows that \dagger isn't a macro; it is defined, as in classic TeX, via \mathchardef. In classic TeX and XeTeX, using a \mathchar token in text mode is an error, but a documented change in LuaTeX is that a \mathchar in text is not an error: the math class is ignored and it acts like \char, selecting a character from the current font.
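
As a minimal sketch of that engine difference, assuming only the article class and no unicode-math: the name \mydagger below is hypothetical, and "2279 is the value that \show should report for \dagger in such a document. The \ifdefined\directlua test is just one common way to detect LuaTeX.

\documentclass{article}
% \mydagger is a hypothetical name used only for this illustration
\mathchardef\mydagger="2279
\begin{document}
\ifdefined\directlua
  % LuaTeX: a \mathchardef token is accepted in text mode without error
  \mydagger
\else
  % XeTeX or pdfTeX: the bare token would give "Missing $ inserted",
  % so it has to be used in math mode
  $\mydagger$
\fi
\end{document}

Whether a visible glyph appears in the LuaTeX branch depends on the fonts in use.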

With unicode-math loaded you will see that, as you guessed, the command is simply defined as the character itself. So, like +, it works in text, and its behavior in math is controlled by the \mathcode of the character:

 \dagger=the character †

\qprime is not a macro but is defined via \Umathchar:

\qprime=\Umathchar"0"00"002057.

This acts like \mathchar: it is an error in text mode in XeTeX, but LuaTeX ignores the first two components and acts like \char"002057. You can check all of this with:

\documentclass{article}
\usepackage{unicode-math}
\begin{document}
\show\alpha\alpha
\show\dagger\dagger
\show\qprime\qprime
\end{document}
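
If a symbol such as \qprime is needed in running text with both engines, one possible workaround (a sketch, not something unicode-math prescribes) is to wrap it in the standard LaTeX command \ensuremath, which switches to math mode only when needed:

\documentclass{article}
\usepackage{unicode-math}
\begin{document}
% \ensuremath enters math mode if the symbol is used in text
A quadruple prime in text: \ensuremath{\qprime}
% and leaves things alone when already in math mode
$f\ensuremath{\qprime}$
\end{document}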