[TeX/LaTeX] XeLaTeX conversion to .doc/.odt/.rtf/.html after running biblatex


This question is similar to many others posted on these forums, but the answers I have seen have not solved my problem (e.g. at Converting Latex to MsWord .doc or .rtf). Please forgive me if this is repetitive, and point me in the direction of its solution. I will be deeply indebted to anyone who has ideas.

In brief, I wrote my dissertation in XeLaTeX largely because of the flexibility of biblatex. I work in a field wherein I have to cite a lot of East Asian source materials, and getting other reference managers to output these sorts of references correctly has been difficult. Biblatex works beautifully, and I'd be sorry to give it up.

However, now that I am on the other side of my dissertation, I need to submit papers (and manuscripts) for publication. In my backwards field, such papers very often need to be submitted as a Word document.

I now need to consider whether I can keep XeLaTeX, biblatex, and the large bibliography file I created during my dissertation as part of my workflow, even if the end product sometimes has to be a Word-readable and Word-editable document. I know of several programs that can convert an uncompiled XeLaTeX document to other formats (pandoc, etc.). And I know that tex4ht can convert some LaTeX documents to HTML or ODT after running biber. But for reasons I can't understand (perhaps incompatibilities between tex4ht and XeLaTeX?), it does not seem to work on my documents. (I am working on Windows 10 with MiKTeX.)

The documents I need to be able to produce are fairly simple (although a CJK bibliography requires some definitions in order to work). I don't have graphs, charts, or images to include: just Chinese and Japanese text.

Here is an extremely minimal example that I cannot get to work. This is the latex file:

\setmainfont[Ligatures={Common, TeX}]{Times New Roman}


    Zhupo shihua 竹坡詩話


Of course, this extremely minimal code compiles perfectly into a PDF with XeLaTeX. When I then run htlatex on it (i.e. I type htlatex test.tex at the command prompt), I get the following error message:

C:\Users\lbxxx>latex  \makeatletter\def\HCode{\futurelet\HCode\HChar}\def\HChar{\ifx"\HCode\def\HCode"##1"{\Link##1}\expandafter\HCode\else\expandafter\Link\fi}\def\Link#1.a.b.c.{\g@addto@macro\@documentclasshook{\RequirePackage[#1,html]{tex4ht}}\let\HCode\documentstyle\def\documentstyle{\let\documentstyle\HCode\expandafter\def\csname tex4ht\endcsname{#1,html}\def\HCode####1{\documentstyle[tex4ht,}\@ifnextchar[{\HCode}{\documentstyle[tex4ht]}}}\makeatother\HCode .a.b.c.\input  "C:\Users\lbxxx\Desktop\Tex Testing\onlinetest.tex"
This is pdfTeX, Version 3.14159265-2.6-1.40.18 (MiKTeX 2.9.6300 64-bit)
entering extended mode
LaTeX2e <2017-04-15>
Babel <3.9r> and hyphenation patterns for 72 language(s) loaded.
! Undefined control sequence.
<*> ...}\makeatother\HCode .a.b.c.\input  C:\Users
                                                  \lbxxx\Desktop\Tex Testing...


I can hit r several times and nothing at all comes out. Same basic thing for htxelatex.

Of course, the actual files I would like to be able to convert this way are much more complicated. Here would be a minimal example, this time including footnotes and bibliography (many thanks to moewe for much of this code):

\setmainfont[Ligatures={Common, TeX}]{Times New Roman}


\ProvidesFile{chicago-notes.dbx}[2016/07/24 extended name format for biblatex]
  title={A Nation-State by Construction: Dynamics of Modern Chinese Nationalism},
  author={given=Suisheng, family=Zhao, cjk=趙歲升},
  address = {Stanford},
  publisher={Stanford University Press}}
    author = {family=Zhou, cjk=周紫芝, given=Zizhi},
    title = {Zhupo shihua},
    titleaddon = {竹坡詩話},
    series = {Yinying Wenyuange Siku quanshu edition},
    year = {1985},
    address = {Taibei},
    publisher = {Taiwan shangwu yinshu guan}
  author={Smith, Junior, Jim},
  address = {Stanford},
  publisher={Stanford University Press}}

\usepackage[notes,strict,annotation,cmsdate=both,isbn=false, backend=biber]{biblatex-chicago}

% Based on definitions from biblatex.def




                     test {\ifdefvoid\namepartgiven}
                     test {\ifdefvoid\namepartprefix}}









So, in short, is there any way to continue using biblatex with my CJK materials but output to .html, .odt, .rtf, or .doc? I am not particular about the formatting of the final document, except that the text needs to be there and footnotes need to be footnotes. Please do not suggest converting the final PDF to Word; I'd rather type in all my bibliographic material by hand than deal with all the headaches that causes.

Anyone who can solve this will have my eternal gratitude.

Best Answer

Your main issue is that the TeX file cannot be included in the compilation. I can't reproduce this, but my guess is that it is caused by the spaces in your file path.
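If spaces in the path really are the cause (this is only a guess and was not reproduced), one workaround is to copy the source file into a directory whose path has no spaces and compile it there. Here is a minimal Python sketch. The helper names `has_spaces` and `copy_to_space_free_dir` and the target directory are hypothetical, chosen only for illustration:

```python
import shutil
from pathlib import Path


def has_spaces(path):
    """Return True if the absolute form of the path contains a space."""
    return " " in str(Path(path).resolve())


def copy_to_space_free_dir(tex_file, target_dir):
    """Copy tex_file into target_dir and return the path of the copy.

    target_dir is an assumption: choose any location whose full path
    contains no spaces, e.g. a short directory directly under the drive root.
    """
    target = Path(target_dir)
    if has_spaces(target):
        raise ValueError(f"target directory still contains a space: {target}")
    target.mkdir(parents=True, exist_ok=True)
    # shutil.copy2 returns the path of the newly created file
    return Path(shutil.copy2(tex_file, target))
```

Any files the document depends on (the .bib file, the .4ht configuration files) would need to be copied alongside it in the same way.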

Anyway, once you manage to actually compile your document, you will face further issues. There are some minor issues with the biblatex package and a big issue with the xeCJK package, which causes tex4ht to fail. Both can be fixed easily with some custom configuration files.

Save the following files to your document's directory:


% usepackage.4ht (2017-01-31-15:40), generated from tex4ht-4ht.tex
% Copyright 2003-2009 Eitan M. Gurari
% Copyright 2009-2017 TeX Users Group
% This work may be distributed and/or modified under the
% conditions of the LaTeX Project Public License, either
% version 1.3c of this license or (at your option) any
% later version. The latest version of this license is in
%   http://www.latex-project.org/lppl.txt
% and version 1.3c or later is part of all distributions
% of LaTeX version 2005/12/01 or later.
% This work has the LPPL maintenance status "maintained".
% The Current Maintainer of this work
% is the TeX4ht Project <http://tug.org/tex4ht>.
% If you modify this program, changing the
% version identification would be appreciated.
   \def\:temp{tex4ht}\ifx \:temp\@currname
   \:warning{\string\usepackage{tex4ht} again?}
   \def\:temp#1htex4ht.def,tex4ht.sty#2!*?: {\def\:temp{#2}}
\expandafter\:temp \@filelist htex4ht.def,tex4ht.sty!*?: %
\ifx \:temp\empty  \else
    \string\RequirePackage[tex4ht]{hyperref} or
    \string\usepackage[tex4ht]{hyperref} was
    used try instead, repectively,
    \string\RequirePackage{hyperref} or

\gdef\a:usepackage{\use:package xr,xr-hyper,savetrees,fontspec,xeCJK,biblatex,,!*?: }
   \if :#1:\def\:temp##1!*?: {}\else
      \def\:temp{#1}\ifx \@currname\:temp
             \def\:temp##1!*?: {\input usepackage.4ht  }%
      \else \let\:temp=\use:package \fi
   \fi \:temp}
\def\:temp{xr}\ifx \@currname\:temp

\def\:temp{xr-hyper}\ifx \@currname\:temp

\def\:temp{savetrees}\ifx \@currname\:temp
\def\:temp{fontspec}\ifx \@currname\:temp
    \input usepackage-fontspec.4ht

\def\:temp{xeCJK}\ifx \@currname\:temp
%\input tuenc-xetex.4ht


\def\:temp{biblatex}\ifx \@currname\:temp


and configuration file for biblatex, biblatex.4ht:

% biblatex.4ht (2016-03-16-10:08), generated from tex4ht-4ht.tex
% Copyright 2007-2009 Eitan M. Gurari
% Copyright 2009-2016 TeX Users Group
% This work may be distributed and/or modified under the
% conditions of the LaTeX Project Public License, either
% version 1.3c of this license or (at your option) any
% later version. The latest version of this license is in
%   http://www.latex-project.org/lppl.txt
% and version 1.3c or later is part of all distributions
% of LaTeX version 2005/12/01 or later.
% This work has the LPPL maintenance status "maintained".
% The Current Maintainer of this work
% is the TeX4ht Project <http://tug.org/tex4ht>.
% If you modify this program, changing the
% version identification would be appreciated.
\immediate\write-1{version 2016-03-16-10:08}

   \ifdim\abx@version pt< 3pt \xdef\blx:ver:no{2}\else\xdef\blx:ver:no{3}\fi
         {\gHAdvance\shorthands:cnt by 1\relax
          \ifnum \shorthands:cnt=1 \a:printshorthands
          \else                    \c:printshorthands \fi
   \csname a:@shorthands\endcsname}
\def\a:entryhead:full{CV Radhakrishnan}
      {\IgnorePar\EndP \gHAdvance\bib:N by 1
       \HCode{<text:bibliography text:name="bib-\bib:N" >
           <text:index-entry-span>: </text:index-entry-span>\Hnewline
            text:bibliography-data-field="author" />\Hnewline
           <text:index-entry-span>, </text:index-entry-span>\Hnewline
            text:bibliography-data-field="title" />\Hnewline
           <text:index-entry-span>, </text:index-entry-span>\Hnewline
            text:bibliography-data-field="year" />\Hnewline
        \HCode{<text:p text:style-name="p-bibitem">}%
        \gHAdvance\bibN by 1
          text:name="X0-\csname BIB-\bibN\endcsname">%

  {\ifvmode \IgnorePar \fi \EndP \EndP
    \HCode {<dl class="thebibliography">}%
% This is for linking citations with biblist items which
% are in a different file when output is split into different
% chunks. [CVR 2012-09-27]
% <biblatex-2.2>
% </biblatex-2.2>
    \PushMacro \end:itm \global \let \end:itm =\empty}%
  {\ifvmode \IgnorePar \fi \EndP
    \PopMacro \end:itm \global \let \end:itm \end:itm \EndP
    \HCode {</dd></dl>}\ShowPar}%
  {\ifvmode \IgnorePar \fi \EndP \gHAdvance \bibN by 1
    \end:itm \global \def \end:itm {\EndP \Tg </dd>}%
    \HCode {<dt id="X\therefsection-\abx@field@entrykey"
      class="thebibliography">}\bgroup \bf}%
  {\ifvmode \IgnorePar \fi \EndP
    \HCode {</dt><dd\Hnewline id="bib-\bibN"
    \par \ShowPar}%

% \def\blx@checksum{\ifx \blx@checksum@old \blx@checksum@new \else
%   \blx@warning@noline {Page references have changed.\MessageBreak
%     Rerun to get references right}\@tempswatrue \blx@reruntrue \fi
%   \@nameuse {blx@rerun}}
     \csname onthebibliography:list\endcsname
\ifnum\blx:ver:no < 3
  \global\advance\c@bib 1
    \string\csname\space BIB-\thebib\string\endcsname
  \edef\blx@bbl@data{blx@data@\the\c@refsection @\abx@field@entrykey}%
% Biblatex 3.0
% Hacks for biblatex
% MakeUppercase is redefined by tex4ht, biblatex tries to redefine it as well, but it relies on original
% LaTeX version:
% Same applies also for \MakeLowercase

% I don't really understand this, but language processing is broken by default
% with biblatex. It loads language file, but it executes code which should be
% executed only in the case if the language file fails, it displays an error message
% and language handling doesn't work. When we execute following code, the language
% files are loaded before checking of the success and it seems to work.

      {% This is required for languages which are never explicitly selected

% Following macros doesn't seem to work with biblatex 3.4. We should make another test for
% biblatex > 3.0 and < 3.3
\ifdim\abx@version pt < 3.3pt
  \edef\blx@bbl@data{blx@data@\the\c@refsection @\blx@slist@scheme
  \ifcsundef{blx@pref@\the\c@refsection @\abx@field@entrykey}
    {\listcsxadd{blx@slists@\the\c@refsection @entry@\blx@slist@scheme}%
    {\listcsxadd{blx@slistsbib@\the\c@refsection @entry@\blx@slist@scheme}%
     \listcsxadd{blx@type@\the\c@refsection @\abx@field@entrytype}%
\fi % end of version boolean
   \expandafter\ifx \csname a:printfield-#2\endcsname\relax
       {\csname a:printfield-#2\endcsname}%
       {\csname b:printfield-#2\endcsname}%
   \csname o:\string\blx@printfield:\endcsname[#1]{#2}%
   \expandafter\ifx \csname a:bibstring-#2\endcsname\relax
       {\csname a:bibstring-#2\endcsname}%
       {\csname b:bibstring-#2\endcsname}%
   \csname o:\string\blx@bibstring:\endcsname[#1]{#2}%
   \expandafter\ifx \csname a:bibcpstring-#2\endcsname\relax
       {\csname a:bibcpstring-#2\endcsname}%
       {\csname b:bibcpstring-#2\endcsname}%
   \csname o:\string\blx@bibcpstring:\endcsname[#1]{#2}%
   \expandafter\ifx \csname a:biblcstring-#2\endcsname\relax
       {\csname a:biblcstring-#2\endcsname}%
       {\csname b:biblcstring-#2\endcsname}%
   \csname o:\string\blx@biblcstring:\endcsname[#1]{#2}%
   \expandafter\ifx \csname a:bibucstring-#2\endcsname\relax
       {\csname a:bibucstring-#2\endcsname}%
       {\csname b:bibucstring-#2\endcsname}%
   \csname o:\string\blx@bibucstring:\endcsname[#1]{#2}%
   \ifx \:temp\blx@cbxfile


   \ifx \UnDef\biblatex:style
         not available}%
  \ifhmode \spacefactor\blx@sf@par\fi
\blx@unitmark=10pt plus 1pt minus 1pt
% <Kristian.Debrabant@cs.kuleuven.be> reported that After updating
% biblatex and biblatex.ht to versions 2.2 respectively
% 2012-09-28-17:49 (using MiKTeX 2.9 64 bit), tex4ht seemed no longer
% respected the defernumbers option in biblatex.sty: When applied to
% the attached file tex4hterror.tex.
% The problem was due to nullifying \abx@aux@number which in fact
% should have been redefined to \blx@aux@number when defernumbers
% option is true.
% This is done now and as per Kristian, the fix works fine now.
\ifnum\blx:ver:no < 3
     %\blx@addchecksum{#1}{#4} % this can cause a nodocument error!

\fi % end of version boolean
  \csname a:blx@unit\endcsname
  \csname b:blx@unit\endcsname



  \xifinlist{X\the\c@refsection -%@
    {\listxadd\blx@anchors{X\the\c@refsection -%@
     \hyper:natanchorstart{X\the\c@refsection -%@


   \hyper:natlinkstart{X\the\c@refsection -%@

   \hyper:natlinkstart{X\the\c@refsection -%:

   \hyper:natanchorstart{X\the\c@refsection -%:


% Oleg Domanov odomanov@yandex.ru reports:
% tex4ht ends with an error when compiles biblatex files. I'm on
% Windows, texlive 2012. I put here a minimal example and files
% generated with the command latexmk test && mk4ht oolatex test
% https://www.dropbox.com/s/hn1zm40htqs13mf/t4htlink.zip
% There is a superfluous \relax in the file test.tmp, line 65 which
% seems to cause the error.
% Changes to cope with biblatex upgrade caused this problem. It is now
% fixed. --CVR 2012/10/26
      \expandafter\ifx\csname QXpage.\thepage\endcsname\relax%
        \HCode{<a id="page.\thepage"></a>}%
        \expandafter\xdef\csname QXpage.\thepage\endcsname{0}%
      \Link[\csname BibFileName\therefsection\endcsname]{}{#1}\EndLink}
      \expandafter\ifx\csname QXpage.\thepage\endcsname\relax%
        \HCode{<a id="page.\thepage"></a>}%
        \expandafter\xdef\csname QXpage.\thepage\endcsname{0}%
      \Link[\csname BibFileName\therefsection\endcsname]{#1}{}}

\ifx\blx:ver:no < 3
% biblatex 2.9a
% Newly added to process {keylist} environment (CVR)
% biblatex 3.0
\fi  % End of version boolean

   {\EndP\HCode{<dl \a:LRdir class="description">}%
   {\PopMacro\end:itm \global\let\end:itm \end:itm
   {\end:itm \global\def\end:itm{\EndP\Tg</dd>}\HCode{<dt
        class="description">}\bgroup \bf}%
   {\egroup\EndP\HCode{</dt><dd\Hnewline class="description">}}

    \@footnotetext,%          latex
    \H@@footnotetext,%        hyperref
    \scr@saved@footnotetext,% koma-script 3.x
    \l@dold@footnotetext,%    ledmac
    \l@doldold@footnotetext,% ledmac
    \@fntORI}%                frenchle


Now, instead of htxelatex, I would use make4ht. It is much more flexible. You will need a build file to enable biber compilation and to enable both odt and html output. Save the following file as mybuild.mk4:

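-- Register a "biber" command so that it can be called as Make:biber {}
Make:add("biber", "biber ${input}")

-- ODT output: load the ooffice configuration and the matching t4ht options
if mode:match "odt" then
  settings.tex4ht_sty_par = settings.tex4ht_sty_par .. ",ooffice"
  settings.t4ht_par = settings.t4ht_par .. " -coo -cooxtpipes"
end

if mode:match "draft" then
  -- Draft mode: a single LaTeX run, no biber
  Make:htlatex {}
else
  -- Full build: LaTeX, biber, then LaTeX again to pick up the bibliography
  Make:htlatex {}
  Make:biber {}
  Make:htlatex {}
end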

With make4ht, you can use the --mode (or -m) command line option to select which compilation sequence is used. You will also need -u for Unicode output and -x for XeLaTeX. A basic compilation would thus be:

make4ht -ux -e mybuild.mk4 filename.tex

This compiles the document to HTML and runs biber after the first LaTeX compilation. On subsequent compilations, when the bibliography hasn't changed and running biber is therefore unnecessary, you can use draft mode:

make4ht -uxm draft -e mybuild.mk4 filename.tex

To get an ODT document, add odt to the mode (here combined with draft):

make4ht -uxm odt-draft -e mybuild.mk4 filename.tex
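The three invocations above differ only in the mode string. Here is a small Python sketch of how the command line is put together and of when the build file runs biber. The function names `make4ht_command` and `runs_biber` are hypothetical helpers, not part of make4ht, and the short options are written separately rather than bundled as `-uxm`:

```python
def make4ht_command(tex_file, mode=None, build_file="mybuild.mk4"):
    """Build the make4ht argument list used in this answer.

    -u selects Unicode output, -x selects XeLaTeX, -m passes the mode
    string that the build file inspects, and -e names the build file.
    """
    cmd = ["make4ht", "-u", "-x"]
    if mode:
        cmd += ["-m", mode]
    cmd += ["-e", build_file, tex_file]
    return cmd


def runs_biber(mode):
    """Mirror the build file: biber is skipped only if the mode contains 'draft'."""
    return "draft" not in (mode or "")
```

For example, `make4ht_command("filename.tex", "odt-draft")` yields an argument list that can be passed to `subprocess.run(..., check=True)`. Because the build file matches substrings of the mode, `odt-draft` triggers both the ODT settings and the draft sequence.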

This is the resulting HTML:

[Image: screenshot of the resulting HTML output]


It seems that your distribution is missing some files distributed with tex4ht, so you may also need this one, usepackage-fontspec.4ht. There seems to be a limit on how many characters a post can contain, so I had to upload it to Gist.
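Since the fix depends on these configuration files sitting next to the document, a quick check before compiling can save a confusing failure. This Python sketch is only illustrative: the helper name `missing_4ht_files` is hypothetical, and the file list is simply the three names mentioned in this answer (the third may already be provided by your distribution):

```python
from pathlib import Path

# File names taken from this answer; adjust the tuple if your setup differs.
REQUIRED_4HT = ("usepackage.4ht", "biblatex.4ht", "usepackage-fontspec.4ht")


def missing_4ht_files(doc_dir, required=REQUIRED_4HT):
    """Return the names in `required` that are not regular files in doc_dir."""
    folder = Path(doc_dir)
    return [name for name in required if not (folder / name).is_file()]
```

An empty list means all three files are in place, and anything returned still has to be saved to the document's directory.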
