[TeX/LaTeX] No simple UTF-8 support in LaTeX

input-encodings latex-misc unicode

The following should look good in your browser and compile in LaTeX to a beautiful result, right?

\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
UTF-8 test:
\begin{verbatim}
logic:  not:¬
        and: ∧  nand:⊼
        or: ∨   nor:⊽
        inequal to, xor:≢,≠,⊻,⊕     equal to: ≡,=
Set logic:  ∪,∩,⊊,⊋,⊆,⊇,⊈,⊉,≡ 
    avoid usage of non-specific: ⊂,⊃,⊄,⊅
Set Membership:∈,∉,∌,∋,∅
assignment ≔≕
predicate logic:∃,∄,∀,∴,∵
regex: /|\.⋆^?+(){,}[]
Arrows: ↖↑↗⇖⇑⇗
        ←↔→⇐⇔⇒⇕↕
        ↙↓↘⇙⇓⇘
        ↚↮↛⇍⇎⇏
numbers:ℕatural={0,1,2,3, … ∞}, ℤintegers,
        ℚ=rational, ¬ℚ=irrational,
        ℝeal, ±∞, ℂomplex
size: ≤,≥,<,>,≮,≯,≰,≱,≪,≫,≢,≠,≡,=
Greek Alphabet:
ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣ ΤΥΦΧΨΩ
αβγδεζηθικλμνξοπρσςτυφχψω
Game:
Chess symbols:♚♛♜♝♞♟♔♕♖♗♘♙
Playing card symbols:♠♣♥♦ 23456789JQKA
miscellaneous:°∙■□▪▫○●☇
\end{verbatim}
\end{document}

Currently (2010 Dec 12, on the latest Ubuntu with all packages updated) I get the following error:

! Package inputenc Error: Unicode char \u8:**"insert random char here"** not set up for use with LaTeX.

Best Answer

The original TeX engine was not designed to handle multi-byte encodings, so any package that adds Unicode support on top of it is bound to be imperfect. There are, however, two newer engines that provide full Unicode support: LuaTeX and XeTeX. The fontspec package makes it easy to select any OpenType or TrueType font for use with these engines. (There is also unicode-math, which supports Unicode-aware math fonts such as STIX (or XITS) and Cambria Math, but it is still under heavy development and not yet complete.)

Be aware that the versions of these engines and packages shipped with Ubuntu 10.10 (and earlier) are very outdated. To use them, you have to install TeX Live 2010 manually (which is easy).

After you have updated, you can use lualatex or xelatex to compile the following code:

\documentclass{article}
\usepackage{fontspec}
\setmainfont{FreeSerif}
\setmonofont{FreeMono}

\begin{document}
\begin{verbatim}
% your example here
\end{verbatim}
\end{document}

This sets FreeSerif as the main font and FreeMono as the monospace font. FreeSerif isn't the most beautiful font, but I do not know of any other free font that rivals its Unicode coverage; it comes preinstalled with Ubuntu, and your browser probably uses it to display the symbols. FreeMono lacks the chess symbols and the diagonal implication arrows, but provides everything else in your example; FreeSerif would provide everything but isn't monospaced. Another free font family with good Unicode coverage is DejaVu, which is based on Bitstream Vera. The compiled document looks as follows:

[image: the compiled example]
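The coverage differences between the two fonts can also be checked from inside a document. The following is only a minimal sketch: it assumes both fonts are installed under the names used above, and \checkglyph is a hypothetical helper (not part of fontspec) built on the e-TeX primitive \iffontchar, which XeTeX and LuaTeX both provide.

```latex
% Compile with lualatex or xelatex.
\documentclass{article}
\usepackage{fontspec}
\newfontfamily\freeserif{FreeSerif}
\newfontfamily\freemono{FreeMono}

% Hypothetical helper: #1 is a code point in uppercase hex, e.g. 2654.
% Prints the glyph if the current font contains it, otherwise "missing".
\newcommand*{\checkglyph}[1]{%
  \iffontchar\font"#1 %
    U+#1: \symbol{"#1}%
  \else
    U+#1: missing%
  \fi}

\begin{document}
\freemono  \checkglyph{2654}\par % white chess king, tested in FreeMono
\freeserif \checkglyph{2654}     % the same code point, tested in FreeSerif
\end{document}
```

The result depends on the font versions installed, so treat the output as a quick diagnostic rather than a definitive statement about either font.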

Of course, you could replace \setmonofont{FreeMono} with \setmonofont{FreeSerif}. Your verbatim text would then no longer be monospaced, but it would contain all the symbols. See the fontspec manual for other fancy things that are possible.
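As a sketch of that variant, assuming FreeSerif is installed, the preamble changes in a single line. The verbatim body here is shortened to the symbols that FreeMono was said to lack:

```latex
% Compile with lualatex or xelatex.
\documentclass{article}
\usepackage{fontspec}
\setmainfont{FreeSerif}
\setmonofont{FreeSerif} % proportional font in the monospace slot: columns will not align

\begin{document}
\begin{verbatim}
Chess symbols:♚♛♜♝♞♟♔♕♖♗♘♙
Arrows: ⇖⇑⇗
        ⇙⇓⇘
\end{verbatim}
\end{document}
```

The trade-off is purely visual: the symbols appear, but the indentation used for alignment in the verbatim block no longer lines up.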

If you are writing a text that mixes several languages and scripts, a single font will probably not cover all the required symbols. In that case the polyglossia package comes in handy (it currently works only with XeLaTeX).
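A minimal sketch of such a setup follows. The language pair and the font assignments are assumptions chosen for illustration; "DejaVu Serif" must be installed under that name, and any font covering the script in question could be substituted.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
\usepackage{polyglossia}
\setdefaultlanguage{english}
\setotherlanguage{greek}

\setmainfont{FreeSerif}
% polyglossia uses \greekfont, when defined, for text marked as Greek.
\newfontfamily\greekfont{DejaVu Serif}

\begin{document}
English text followed by a Greek phrase: \textgreek{αβγδε}.
\end{document}
```

Each additional language gets its own \setotherlanguage line and, where the main font does not cover its script, its own font family.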