What your best option would be depends a lot on what your needs are. Are you only trying to import the structure, or the exact look, or something else? How important is it that the resulting document be done properly?
Anyway, here are a number of things to try.
AbiWord: an open source word processor that can import HTML or similar formats and export LaTeX. (Be sure to install the extra export plugins when installing; the default install doesn't include a LaTeX export, but it can easily be chosen.)
Writer2LaTeX: an OpenOffice plugin for exporting to LaTeX. OpenOffice supports HTML import, of course. (W2L can also convert .odt to .tex without OpenOffice installed, but then converting .html to .odt might be trickier.)
rtf2latex2e: as its name implies, converts RTF to LaTeX; so you'd need some way to convert HTML to RTF (though that's relatively easy, can be done with most any word processor).
pandoc: Haskell program for converting between various mark-up languages, including HTML and LaTeX
html2latex: Perl script for such conversions (I've never tried it but plan on doing so soon)
htmltolatex: Java program along similar lines (again, I haven't tried it)
Even with all those options, however, personally, if it was something I truly cared about doing right, simply transferring over the plain text and redoing everything manually would still be my solution of choice. The above are just quick fixes for a document of relatively little importance, or when having it in LaTeX in addition to HTML is just a matter of convenience.
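Of the tools listed above, pandoc is probably the easiest to script. As a minimal sketch, assuming pandoc is installed and on your PATH, the conversion could be wrapped like this. The function names and the file names used below (cv.html, cv.tex) are hypothetical, and the result will most likely still need manual cleanup.

```python
import shutil
import subprocess


def pandoc_command(html_path, tex_path):
    """Build a pandoc command line that converts HTML to standalone LaTeX."""
    return [
        "pandoc",
        "--from", "html",
        "--to", "latex",
        "--standalone",  # emit a complete document, not just a fragment
        "--output", tex_path,
        html_path,
    ]


def html_to_latex(html_path, tex_path):
    """Run pandoc; raises if pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_command(html_path, tex_path), check=True)
```

Calling html_to_latex("cv.html", "cv.tex") would then write cv.tex next to the input, provided pandoc accepts the HTML.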
I'm afraid there's no easy solution. As Marco stated, a CV requires attention and better control. I'll present a solution I use, though it's not entirely LaTeX-based: sphinx.
According to the website, sphinx is a tool that makes it easy to create intelligent and beautiful documentation, written by Georg Brandl and licensed under the BSD license. It requires Python.
sphinx is mainly used for documentation, but it's generic enough to be used anywhere.
It's a very straightforward process. Let's say I want to create an online CV for John Doe. I simply run sphinx-quickstart and answer a few questions. After running it, you will have index.rst (where the content goes) and conf.py (the configuration file). sphinx also creates both Makefile and make.bat for generating the outputs we want.
The rst extension stands for reStructuredText, a plain-text markup syntax very similar to Markdown. As you will see, there's no secret.
Now, I'll open my index.rst file and type the following content:
.. My Curriculum Vitae documentation master file, created by
   sphinx-quickstart on Wed Nov 30 11:04:16 2011.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

John Doe
========

:Currently: PhD in Cat Physics

Publications
------------

* John Doe. *How to make a cat levitate*. In *Cat Symposium 2011*,
  pages 110--125. CS, 2011.
* John Doe, Jin Doe. *Violence against lolcats*. In *Cat Symposium 2009*,
  pages 98--101. CS, 2009.

Reports
-------

* 2008-2009: Growth of cats around the world. *Tech. report*.
* 2006-2007: Cats are dangerous? *Tech. report*.

Contributions
-------------

* `GCat <http://www.google.com/>`_ : a Google-based browser for cats.

Contact
-------

| John Doe
| john.doe@catsociety.com

.. toctree::
   :hidden:

That's it, plain and simple. Now I just need to run make html to generate the HTML output.
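For reference, the make html target generated by sphinx-quickstart is essentially a wrapper around sphinx-build. Here is a rough sketch of the equivalent call from Python, assuming Sphinx is installed for the current interpreter. The source directory "." and the output root "_build" are assumptions that depend on your quickstart answers, and the function names are hypothetical.

```python
import subprocess
import sys


def sphinx_command(builder, source_dir=".", build_root="_build"):
    """Return the command that runs one Sphinx builder, e.g. 'html' or 'latex'."""
    output_dir = f"{build_root}/{builder}"
    # "python -m sphinx" behaves like the sphinx-build executable
    return [sys.executable, "-m", "sphinx", "-b", builder, source_dir, output_dir]


def build(builder="html"):
    """Run the build and raise CalledProcessError if Sphinx reports failure."""
    subprocess.run(sphinx_command(builder), check=True)
```

Calling build() from the project directory should leave the generated pages under _build/html.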
If you want to change the theme, there are some predefined ones, say, nature. Open conf.py and find the following line:
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
html_theme = 'default'
Simply replace default with nature and run make html again to see the new theme.
Note: You can tweak all elements of the page, e.g. removing the search box, but I can't remember how right now. :)
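The search box can most likely be dropped through the html_sidebars setting in conf.py, which lists the sidebar templates rendered for each page. This is a sketch, assuming a built-in theme that honors html_sidebars and that you want to keep the other default sidebar blocks:

```python
# conf.py (fragment)

# One of the built-in themes mentioned above.
html_theme = "nature"

# Map document-name patterns to the sidebar templates to show.
# "**" matches every page; leaving out "searchbox.html" drops the search box.
html_sidebars = {
    "**": ["localtoc.html", "relations.html", "sourcelink.html"],
}
```

Third-party themes may ignore this setting or use their own template names, so treat the list as a starting point.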
sphinx can also generate a tex file. Run make latex and the .tex file (along with the style files it needs) will be generated.
OK, LaTeX output doesn't look so great. :)
As I said, the most common use of sphinx is to generate documentation, but we can easily tweak our tex file to look more pleasant to the eye.
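Part of that tweaking can be done in conf.py rather than in the generated file. This is a hedged sketch, assuming the master document is index. The target name JohnDoeCV.tex and the choice of font package are purely illustrative. The howto document class gives an article-like layout, which is probably closer to a CV than the book-like manual class.

```python
# conf.py (fragment)

# (source start file, target name, title, author, document class)
latex_documents = [
    ("index", "JohnDoeCV.tex", "Curriculum Vitae", "John Doe", "howto"),
]

latex_elements = {
    "papersize": "a4paper",
    "pointsize": "11pt",
    # Extra LaTeX added to the preamble; loading a font package is one example.
    "preamble": r"\usepackage{lmodern}",
}
```

After changing these values, make latex has to be run again for the new .tex file to pick them up.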
I've seen entire sites written with sphinx. You can create great-looking HTML pages with ease. Use one of the predefined themes or come up with your own. :)
Using PDF as an intermediate format when converting from LaTeX to HTML is not a very good idea. LaTeX and HTML are both mostly structural markup languages, which means you use them to describe the document structure (sections, emphasis, formulas etc.), whereas PDF is mostly about the representation of your document on the screen or on paper. When converting LaTeX to PDF, you lose much of the structural information, and it cannot be successfully recovered by conversion from PDF to HTML.
It is much better to convert LaTeX directly to HTML. There are a number of ways to do that (WayBack Archive); the one I would recommend is htlatex. It is probably already part of your TeX distribution, is very powerful and flexible, and its use can be as simple as running it on your .tex file.
If you tell us more about your environment (which operating system you use, what your TeX distribution is, your text editor/LaTeX IDE, how you generated the PDF file, etc.) we may be able to give you more details on how to use htlatex.
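The simplest invocation passes only the source file name. Here is a minimal sketch that drives it from Python, assuming htlatex (part of TeX4ht) is on your PATH. The function names are hypothetical, and so is cv.tex, which merely stands in for your own file.

```python
import subprocess
from pathlib import Path


def htlatex_command(tex_path):
    """Return the basic htlatex call: the tool name followed by the file name."""
    return ["htlatex", Path(tex_path).name]


def latex_to_html(tex_path):
    """Run htlatex next to the source so the HTML lands in the same directory."""
    tex_path = Path(tex_path).resolve()
    subprocess.run(htlatex_command(tex_path), cwd=tex_path.parent, check=True)
    return tex_path.with_suffix(".html")
```

Running latex_to_html("cv.tex") should produce cv.html alongside the source, together with the auxiliary files htlatex leaves behind.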