[Tex/LaTex] How to Convert XeTeX to HTML

Tags: conversion, html

I realize it's too much to post the entire document. Here is a small portion of it; the rest just repeats the same pattern for other verses. Sorry, this is the smallest example I could reduce it to.

\documentclass[a4paper,12pt]{article}
%\usepackage[left=0.15in,right=0.15in,top=0.75in,bottom=0.75in]{geometry}
\usepackage{fontspec}
\usepackage{xltxtra}
\usepackage{polyglossia}
\usepackage{longtable}
\usepackage[doublespacing]{setspace}
\usepackage{verse}
\usepackage{varwidth}

\AtBeginEnvironment{Verse}{\singlespacing}

\AtBeginEnvironment{longtable}{\onehalfspacing}

\newcommand{\blt}{\begin{longtable}{p{1.5cm}p{0.05cm}p{3.5cm}|p{1.5cm}p{0.05cm}p{3.5cm}|p{1.5cm}p{0.05cm}p{3.5cm}}}
\newcommand{\elt}{\end{longtable}}

\newcommand{\bv}{\begin{Verse}\large}
\newcommand{\ev}{\end{Verse}}

\newcommand{\bfns}{\begin{footnotesize}}
\newcommand{\efns}{\end{footnotesize}}

\newcommand{\sz}{\\[\normalbaselineskip]}
\newcommand{\csec}[1]{\section{#1}}

\newenvironment{Verse}
  {\center\varwidth{\linewidth}}
  {\endvarwidth\endcenter}

\setmainfont[Script=Devanagari]{Sanskrit 2003}

\begin{document} 
\bv
तपः स्वाध्यायनिरताम् तपस्वी वाग्विदाम् वरम् |          \\ 
नारदम् परिपप्रच्छ वाल्मीकिर्मुनिपुङ्गवम् ॥ १.१.१ ॥
\ev

\bfns  \blt \hline
तपस्वी & = &  sagacious thinker & वाल्मीकि: & = &  Sage [Poet]  वाल्मीकि & तपः & = & in thoughtful-meditation  \\
स्व अध्याय & = & in self, study of scriptures & निरतम् & = & always - who is eternally studious in scriptures& वाक् & = & in speaking [in enunciation] \\

विदाम् & = & among expert enunciators & वरम् & = & sublime one - with नारद & मुनि पुन्गवम् & = & with sage, paragon, with such a paragon sage Naarada \\

नारदम्  & = &  with [such a sage] Naarada & परिपप्रच्छ  & = &  verily [inquisitively,] inquired about

\elt \efns

सर्व गुण समिष्टि रूपम् पुरुषम्  all, merited endowments, composite, in form - about such a man.] 
A thoughtful-meditator, an eternally studious sage in scriptures about the Truth and Untruth, a sagacious thinker, and a sublime enunciator among all expert enunciators is नारद, and with such a Divine Sage नारद, the Sage-Poet वाल्मीकि is inquisitively enquiring about a man who is a composite for all merited endowments in his form and calibre. [1-1-1]\\  
\end{document}

This is the command I ran: htxelatex bala.tex # generate html. Now I am getting the following error:

(/usr/local/texlive/2011/texmf-dist/tex/generic/tex4ht/html4-math.4ht))
(./bala.aux)

! LaTeX Error: Command `\acute' already defined in `'.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...                                              

l.33 \begin{document}

?

I changed the package options of fontspec to \usepackage[no-math]{fontspec} and it gave me this error:

----------------------------------------------------
(--- xdv font = [/usr/local/texlive/2011/texmf-dist/fonts/opentype/public/lm/lmmono10-regular.otf] (not implemented) ---)
--- warning --- Couldn't find font `[/usr/local/texlive/2011/texmf-dist/fonts/opentype/public/lm/lmmono10-regular.otf].htf' (char codes: 0--255)
(--- xdv font = Sanskrit2003 (not implemented) ---)
--- warning --- Couldn't find font `Sanskrit2003.htf' (char codes: 0--255)
(--- xdv font = Sanskrit2003 (not implemented) ---)
--- error --- Illegal storage address
----------------------------

t4ht.c (2010-12-16-08:47 kpathsea)
t4ht -.xdv 
  -f/Bala 
(/usr/local/texlive/2011/texmf-dist/tex4ht/base/unix/tex4ht.env)
Entering Bala.lg

Best Answer

I know that you have already found a solution, but there is also a solution using tex4ht, in case anybody is interested.

There are three problems:

  1. You ran htxelatex bala.tex # generate html. The # generate html part should not be there; it is just a comment. For Unicode input, the correct command is htlatex bala.tex "xhtml, charset=utf-8" " -cunihtf -utf8". Support for XeTeX seems to be limited, so it is best to run htlatex rather than htxelatex.
  2. tex4ht is incompatible with some XeLaTeX packages, in this case xltxtra, fontspec and polyglossia, so you must edit your preamble so that they are not loaded when tex4ht is running.
  3. This problem is the most interesting one: whether you compile your file with htlatex, htxelatex or htlualatex, the Devanagari characters are missing. I don't know the right solution, but you can use inputenc's \DeclareUnicodeCharacter to enter the character codes directly with the help of the tex4ht macro \HChar:

    \DeclareUnicodeCharacter{0904}{\HChar{224}\HChar{164}\HChar{132}} % ऄ
    

    You can find all the codes in the file devng4ht.sty.
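
The three numbers passed to \HChar in the declaration above match the UTF-8 bytes of the code point: U+0904 encodes as E0 A4 84, which is 224 164 132 in decimal. Assuming the same pattern holds for the rest of the Devanagari block, further declarations could be sketched as below. The characters shown were picked for illustration and the byte values were worked out by hand; they are not copied from devng4ht.sty.

```latex
% Sketch only: belongs in the tex4ht branch of the preamble,
% after \usepackage[utf8]{inputenc}.
% Each \HChar argument is one UTF-8 byte of the code point, in decimal.
\DeclareUnicodeCharacter{0905}{\HChar{224}\HChar{164}\HChar{133}} % अ  (E0 A4 85)
\DeclareUnicodeCharacter{0924}{\HChar{224}\HChar{164}\HChar{164}} % त  (E0 A4 A4)
\DeclareUnicodeCharacter{0940}{\HChar{224}\HChar{165}\HChar{128}} % ी  (E0 A5 80)
```

Code points from U+0900 to U+093F share the first two bytes 224 164, and those from U+0940 to U+097F share 224 165, so only the last byte changes within each half of the block.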

So now your preamble should look like this:

\documentclass[a4paper,12pt]{article}
\usepackage{etoolbox}
\makeatletter
\@ifpackageloaded{tex4ht}{
\usepackage[utf8]{inputenc}
\usepackage{devng4ht}
}{%
\usepackage{fontspec}
\usepackage{xltxtra}
\setmainfont[Script=Devanagari]{Sanskrit 2003}
\usepackage{polyglossia}
}

\newcommand{\bv}{\begin{Verse}}
\newcommand{\ev}{\end{Verse}}

\usepackage{longtable}
\usepackage[doublespacing]{setspace}
\usepackage{verse}
\usepackage{varwidth}

\AtBeginEnvironment{Verse}{\singlespacing}

\AtBeginEnvironment{longtable}{\onehalfspacing}

\newcommand{\blt}{\begin{longtable}{p{1.5cm}p{0.05cm}p{3.5cm}|p{1.5cm}p{0.05cm}p{3.5cm}|p{1.5cm}p{0.05cm}p{3.5cm}}}
\newcommand{\elt}{\end{longtable}}

\newcommand{\bfns}{\begin{footnotesize}}
\newcommand{\efns}{\end{footnotesize}}

\newcommand{\sz}{\\[\normalbaselineskip]}
\newcommand{\csec}[1]{\section{#1}}

\newenvironment{Verse}
  {\center\varwidth{\linewidth}}
  {\endvarwidth\endcenter}
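
The package switch at the top of this preamble could also be written with a plain TeX conditional. The following is a minimal sketch, assuming that \HCode is defined only when the document is compiled through tex4ht; that test is a commonly used idiom, not something taken from the preamble above.

```latex
\documentclass[a4paper,12pt]{article}
\ifdefined\HCode
  % tex4ht run (htlatex): UTF-8 input, Devanagari via \HChar declarations
  \usepackage[utf8]{inputenc}
  \usepackage{devng4ht}
\else
  % ordinary XeLaTeX run: load the font machinery as before
  \usepackage{fontspec}
  \usepackage{xltxtra}
  \usepackage{polyglossia}
  \setmainfont[Script=Devanagari]{Sanskrit 2003}
\fi
```

This variant needs no \makeatletter, because it does not use \@ifpackageloaded.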

There is one limitation: you cannot use font-changing switches like \large or \footnotesize. For some reason, they break the rendering of Devanagari.

But you can use a private configuration file to adjust your macros using HTML tags and CSS. File devng.cfg:

\Preamble{xhtml, charset=utf-8}
\Css{
  .verse{
    font-size:large;
    margin:2em;
    width:auto;
  }
  .bfns{
    font-size:small;
  }
}
\begin{document}
\let\old@bv\bv
\renewcommand\bv{\leavevmode\EndP\NoFonts\HtmlParOff\HCode{\Hnewline<div class="verse">\Hnewline}\old@bv}
\let\old@ev\ev
\renewcommand\ev{\NoFonts\HCode{\Hnewline</div>\Hnewline}\old@ev\HtmlParOn\EndNoFonts}
\let\old@bfns\bfns
\renewcommand\bfns{\leavevmode\SaveEndP\NoFonts\Tg<div class="bfns">\old@bfns}
\let\old@efns\efns
\renewcommand\efns{\NoFonts\Tg</div>\old@efns\EndNoFonts}
\EndPreamble

I use \NoFonts to suppress the processing of character codes, which is what makes the Devanagari characters disappear. Now you can finally compile your file:

htlatex bala.tex devng " -cunihtf -utf8"

[Image: Devanagari sample]