I am trying to make a book in both print and ebook. To make the print book I generate a PDF with xelatex. I have reconstructed all the source files to be encoded in UTF-8 and the Cyrillic and Japanese characters are appearing correctly.
To get an ebook, I need to generate HTML. For this I am using htlatex, which appears to use TeX4ht to do the work:
htlatex "mybook.tex" "xhtml, charset=utf-8" " -cunihtf -utf8"
But when generating the HTML, I get errors in the log saying:
Missing character: There is no ½ in font cmr10!
It appears to complain that these characters are not in the font, so it simply doesn't output them. However, we all know that browsers support these characters.
I don't really want to select a font, because I would like the HTML file to not specify the font — I want to use the default font of the browser or ebook reader. Yet, somehow I need to tell LaTeX to quit being so paranoid, and just go ahead and output the character anyway. Is there any way to do that?
I have read elsewhere that I need to select a "unicode" font. With xelatex I use the fontspec package, but that is apparently not allowed by the TeX4ht program (I get an error saying so).
I set the following in the preamble:
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
I suspect this is wrong, since T1 means Latin1 characters? I need Cyrillic and Japanese. Anyway, when I do this, I get a new error:
Package inputenc Error: Unicode char \u8:¥ not set up for use with LaTeX.
Yes, it is not set up right. So, my question is: how do I select a font, or how do I otherwise convince htlatex to simply ALLOW all the characters to be written to the HTML file?
Here is a MWE that demonstrates the problem:
\documentclass{book}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\begin{document}
\frontmatter
\mainmatter
The flight was 4½ hours with a 3 hour change.
We landed in Honolulu. It was 85° and after
rearranging the luggage which took a little time,
ナンシースエンソン quickly got heated up.
\end{document}
My compile script is:
htlatex "MWE.tex" "xhtml, charset=utf-8" " -cunihtf -utf8"
Best Answer
Your example doesn't work even with pdflatex, so it isn't really a surprise that it doesn't work with tex4ht either. The easiest way to get non-European scripts working with tex4ht is to use the helpers4ht bundle, in particular its emulation of the fontspec package. helpers4ht isn't on CTAN yet, so you need to install it yourself.
Now back to your example. It is a little more difficult because of the Japanese, which needs to be handled by some package, such as xeCJK for XeTeX or luatexja for LuaTeX. Neither of these packages is supported by tex4ht, but we can load them using an alternative package loader, which will suppress them when tex4ht is loaded.
You need to compile this example with LuaTeX as the engine for tex4ht. LuaTeX is needed even if your document is normally compiled with XeTeX, because Lua callbacks are used to convert Unicode to a form suitable for tex4ht.
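A rough sketch of how the MWE from the question might be adapted along these lines. It assumes that the helpers4ht bundle provides a package named alternative4ht with an \altusepackage command that behaves as described in the comments; both names are recalled from the bundle and should be checked against its README. No fonts are selected, so fontspec and luatexja-fontspec fall back to their defaults.

```latex
\documentclass{book}
% Assumption: alternative4ht (from helpers4ht) provides \altusepackage.
% In a normal LuaLaTeX run it is expected to load the package as usual;
% under tex4ht it is expected to load a tex4ht-safe stand-in instead
% (or nothing, if no stand-in exists).
\usepackage{alternative4ht}
\altusepackage{fontspec}           % emulated under tex4ht
\altusepackage{luatexja-fontspec}  % Japanese support for LuaTeX, suppressed under tex4ht
\begin{document}
\frontmatter
\mainmatter
The flight was 4½ hours with a 3 hour change.
We landed in Honolulu. It was 85° and after
rearranging the luggage which took a little time,
ナンシースエンソン quickly got heated up.
\end{document}
```

For print, this would be compiled with lualatex. For HTML, one option is `make4ht -ul MWE.tex`, where `-l` selects LuaLaTeX as the engine and `-u` requests UTF-8 output. Whether this is the command the answer originally used is an assumption.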