[Tex/LaTex] Using TeXworks to output a PDF from a TeX file results in a "XeTeX required" error

pdftex texworks xetex

I'm new to TeX and Linux, but I want to learn to use them.

I found an open-source TeX file here, and I want to compile cv.tex into cv.pdf. After searching around, pdfLaTeX seemed to be what I was looking for. However, running it from the command line gives the error "XeTeX is required to compile this document". I do have XeTeX installed, but when I run XeTeX I get the error "Undefined control sequence".

Since I am new to Linux and googling only increased my confusion, I decided to use TeXworks, an IDE that I hoped would have all the dependencies needed to compile the .tex file to PDF. On launch, I believe the "compiler" was set to pdfLaTeX; I clicked the green "play" button and a new window opened with the right content. I was ecstatic. I then modified the .tex file, saved it, and hit the green button again, but this time I got the same errors as on the command line: "XeTeX required" with pdfLaTeX and "Undefined control sequence" with XeTeX.

Can someone please guide me to the right path?

Additional info: If I close TeXworks and open it again, I can see the original output cv.pdf in another window; however, it does not include my changes to cv.tex. I also tried XeLaTeX, and it does not compile correctly either.

The error I get when I use XeLaTeX is:

LaTeX error file: unicode-math.sty not found

Best Answer

Set the typesetting engine to »XeLaTeX« in the corresponding pull-down menu of TeXworks (see picture). Alternatively, you can add % !TEX program = xelatex as the very first line of the source code, and TeXworks will then choose XeLaTeX automatically every time.

[Picture: the typesetting-engine pull-down menu in TeXworks]
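
To make the magic-comment idea concrete, here is a minimal sketch of how a tool could read such a line from the top of a source file. It is not TeXworks's own code: the function name `magic_program`, the regular expression, and the five-line search window are all assumptions chosen for illustration.

```python
import re

# Hypothetical pattern for a "% !TEX program = <engine>" line (an assumption,
# not TeXworks's actual parsing rules).
MAGIC = re.compile(r"^%\s*!TEX\s+program\s*=\s*(\S+)", re.IGNORECASE)


def magic_program(source, max_lines=5):
    """Return the engine named in a magic comment near the top, or None."""
    for line in source.splitlines()[:max_lines]:
        match = MAGIC.match(line.strip())
        if match:
            return match.group(1)
    return None


sample = "% !TEX program = xelatex\n\\documentclass{article}\n"
print(magic_program(sample))
```

A file without such a line yields `None`, in which case an editor would presumably fall back to whatever engine is selected in its menu.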

Related Solutions

[Tex/LaTex] pdflatex error when rendering TeX as PDF in TeXstudio on Windows 7

The error message "Fatal format file error; I'm stymied" means that the TeX binary is trying to load latex.fmt (or pdflatex.fmt), but the version of this binary differs from that of the TeX binary which created the format file. There are two likely causes: you have installed two TeX binaries (in different versions), or you have an old latex.fmt from a previous TeX installation somewhere on your computer.

pdflatex.exe must be implemented as a wrapper that runs the TeX binary (most probably pdftex.exe) and tells it: "hey, load pdflatex.fmt". It is this loading that fails, as described above. I don't know exactly how this is implemented in MiKTeX, sorry; I have never used MS Windows.

Recent TeX distributions ship several TeX binaries: "tex", "pdftex", "luatex" and "xetex". If one of them creates file.fmt and another one reads that file.fmt, the error message mentioned above occurs too. This is why TeX distributions save each generated file.fmt in a directory specific to the TeX binary that created it, and implement a search system over these directory trees.

[Tex/LaTex] Producing PDF from Markdown with pandoc and xelatex generates misleading error messages

It is indeed true that XeTeX can produce invalid UTF-8 in its error output, and I can reproduce this with the following simpler .tex file:

\documentclass{article}
\begin{document}
应该把 123456789 123456789 123 \textwidth换成
\end{document}

So you can consider this either a bug in XeTeX (for producing invalid UTF-8) or in Pandoc (for incorrectly assuming that XeTeX will produce valid UTF-8).

Unicode and UTF-8

The problem, in short, is that you cannot just break a sequence of UTF-8 bytes in any arbitrary place. To take an example, in the string 应该把, the characters are:

U+5E94 CJK UNIFIED IDEOGRAPH-5E94, encoded in UTF-8 as E5 BA 94
U+8BE5 CJK UNIFIED IDEOGRAPH-8BE5, encoded in UTF-8 as E8 AF A5
U+628A CJK UNIFIED IDEOGRAPH-628A, encoded in UTF-8 as E6 8A 8A

So the string as a whole is encoded in UTF-8 as a sequence of 9 bytes:

E5 BA 94 E8 AF A5 E6 8A 8A
\______/ \______/ \______/
   应       该       把

You can break the byte sequence after 0, 3, 6, or 9 bytes to get a valid string containing 0, 1, 2 or 3 characters respectively. But breaking it at some other place results in invalid UTF-8.
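
The break points can be checked mechanically. The following is a small sketch in Python (not part of the original answer; the helper name `is_valid_utf8` is made up for illustration) that encodes the string and tests every possible prefix length:

```python
text = "应该把"
data = text.encode("utf-8")  # 9 bytes: E5 BA 94 E8 AF A5 E6 8A 8A


def is_valid_utf8(chunk):
    """Return True if chunk decodes cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Prefix lengths at which cutting the byte string leaves valid UTF-8.
valid_prefix_cuts = [n for n in range(len(data) + 1) if is_valid_utf8(data[:n])]

print(data.hex(" ").upper())  # E5 BA 94 E8 AF A5 E6 8A 8A
print(valid_prefix_cuts)      # [0, 3, 6, 9]
```

Every other cut position leaves a lead byte without all of its continuation bytes, so decoding fails.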

Unfortunately, that is exactly what XeTeX can do: it can break the byte sequence in some such place, resulting in invalid UTF-8 that Pandoc then fails to cope with (because it assumes valid UTF-8).

Explanation

In the first place, in Unicode-aware engines like XeTeX and LuaTeX, Unicode letters such as 换 and 成 can be part of control sequence names, and there happens to be no control sequence named \textwidth换成, so the system generates an error about an undefined control sequence.

Then, when printing this error to the terminal, TeX tries to add context around the place where the undefined control sequence \textwidth换成 was encountered, which means some additional characters surrounding the occurrence, enough to fill error_line characters. (This value can be increased; see here and here. Increasing it is a good idea anyway and makes this error less likely, but it can still happen with sufficiently long lines, as it does with the example in the question, because the maximum value of error_line is still only 254.)

Unfortunately (and this is the bug), it appears that XeTeX counts by bytes and truncates the output without regard for breaking only at well-defined Unicode code-point sequences. Look for procedure show_context in the XeTeX source code, and compare with print_valid_utf8 in the LuaTeX source code, used in its show_context.

In this example, XeTeX picks up only the last two bytes of the first word (the 8A 8A), which is not a valid UTF-8 sequence. That is why iconv and Pandoc complain.
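
The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual code of XeTeX's show_context or LuaTeX's print_valid_utf8; the function names `tail_bytes_naive` and `tail_bytes_safe` are made up, and the input is assumed to be valid UTF-8 to begin with.

```python
def tail_bytes_naive(data, limit):
    """Keep the last `limit` bytes, ignoring character boundaries."""
    return data[-limit:] if limit > 0 else b""


def tail_bytes_safe(data, limit):
    """Keep at most the last `limit` bytes, starting on a code-point boundary."""
    tail = tail_bytes_naive(data, limit)
    start = 0
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx (0x80-0xBF);
    # skip any that were orphaned by the cut.
    while start < len(tail) and (tail[start] & 0xC0) == 0x80:
        start += 1
    return tail[start:]


line = "应该把 123".encode("utf-8")  # 9 bytes of CJK + 4 ASCII bytes
naive = tail_bytes_naive(line, 6)   # starts with the orphaned bytes 8A 8A
safe = tail_bytes_safe(line, 6)     # orphaned bytes dropped

print(naive)
print(safe.decode("utf-8"))
```

The safe variant may return fewer than `limit` bytes, which is the price of never emitting a partial character.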

Demonstration

The commands I used for compiling the above .tex file with LuaTeX and XeTeX are respectively:

lualatex -interaction=nonstopmode test.tex | iconv -f UTF8

and

xelatex -interaction=nonstopmode test.tex | iconv -f UTF8

With the former (LuaTeX), I get the error message:

! Undefined control sequence.
l.3 ...把 123456789 123456789 123 \textwidth换成

but with the latter (XeTeX), I get an error message that is not valid UTF-8, so iconv fails with

iconv: (stdin):11:7: cannot convert

Without iconv, on my terminal I see printed:

! Undefined control sequence.
l.3 ...?? 123456789 123456789 123 \textwidth换成

and by redirecting the output to a file and viewing it in a raw editor, we can see better what's going on. The following is hexdump output from xxd -g 1 -c 32:

000001c0: 78 29 0a 21 20 55 6e 64 65 66 69 6e 65 64 20 63 6f 6e 74 72 6f 6c 20 73 65 71 75 65 6e 63 65 2e  x).! Undefined control sequence.
000001e0: 0a 6c 2e 33 20 2e 2e 2e 8a 8a 20 31 32 33 34 35 36 37 38 39 20 31 32 33 34 35 36 37 38 39 20 31  .l.3 ..... 123456789 123456789 1
00000200: 32 33 20 5c 74 65 78 74 77 69 64 74 68 e6 8d a2 e6 88 90 0a 20 20 20 20 20 20 20 20 20 20 20 20  23 \textwidth.......

Note the 8a 8a (the last two bytes of 把 = E6 8A 8A) just after the ellipsis (2e 2e 2e meaning ...).
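
As a cross-check, the second hexdump row can be fed to a strict UTF-8 decoder. The sketch below is an addition for illustration; it copies the 32 bytes of that row by hand and reports where decoding first fails.

```python
# The 32 bytes of the hexdump row at offset 0x1e0, copied from above.
row = bytes.fromhex(
    "0a 6c 2e 33 20 2e 2e 2e 8a 8a 20 31 32 33 34 35"
    " 36 37 38 39 20 31 32 33 34 35 36 37 38 39 20 31"
)

try:
    row.decode("utf-8")
    bad_offset = None
except UnicodeDecodeError as err:
    bad_offset = err.start  # index of the first undecodable byte

print(bad_offset)
# Decoding leniently substitutes U+FFFD for each stray byte.
print(row.decode("utf-8", errors="replace"))
```

The failure lands on the first 8a, directly after the three 2e bytes of the ellipsis, which matches the observation above.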