[TeX/LaTeX] UTF-8 (BMP character set) support in listings

listings unicode

I'm submitting a paper to a journal using LaTeX, and I need to be able to write Unicode symbols in listings. So far, I've been able to get by with the moreverb package. Its listing environment works well in conjunction with the utf8x option of inputenc. However, some symbols, like Σ, just won't work.

Note: I'm using listings because the contents are code-snippets and not traditional math expressions. Here are some minimal examples.

\documentclass[preprint, 11pt]{sigplanconf}
\usepackage[utf8x]{inputenc}
\usepackage{amsmath}
\usepackage{verbatim}
\usepackage{moreverb}
\usepackage{listings}

This works:

\begin{listing}{1}
test = 2 ∈ s
\end{listing}

This doesn't work:

\begin{listing}{1}
sum = Σ(s)
\end{listing}

Please help!

Best Answer

Your problem is not really listing related; you'll get the error outside of a listing as well. The point is that LaTeX simply does not know how to display these Unicode characters.

With math characters this would normally be easy to solve. However, because the moreverb environment is verbatim, you cannot use math markup directly inside the listing. Luckily, the inputenc package gives you a way out: you need to tell LaTeX explicitly how it should render the UTF-8 characters it does not know.

\documentclass[11pt]{article}
\usepackage[utf8]{inputenc}
\usepackage{amsmath}
\usepackage{verbatim}
\usepackage{moreverb}

\DeclareUnicodeCharacter{03A3}{\ensuremath{\Sigma}}
\DeclareUnicodeCharacter{2208}{\ensuremath{\in}}

%% or simply use the newunicodechar package
%% which does the same without the need of remembering numbers:
% \usepackage{newunicodechar}
% \newunicodechar{Σ}{\ensuremath{\Sigma}}
% \newunicodechar{∈}{\ensuremath{\in}}

\begin{document}
∈ Σ

\begin{listing}{1}
test = 2 ∈ s
\end{listing}

\begin{listing}{1}
sum = Σ(s)
\end{listing}

\end{document}

Edit: Changed to use \ensuremath instead of $..$

Edit 2: Added code for the newunicodechar package

Note: You are not restricted to using math mode. You may just as well switch to another font that includes the needed character and use it directly.
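
As a minimal sketch of that font-based route, assuming the textgreek package and the Greek text fonts it relies on are installed (neither is part of the setup above), Σ can be mapped to a text-mode glyph instead of a math symbol. The package choice is only one possibility, and ∈ has no counterpart in it, so it would still need the \ensuremath mapping:

\documentclass[11pt]{article}
\usepackage[utf8]{inputenc}
\usepackage{textgreek} % assumption: provides text-mode Greek such as \textSigma
\usepackage{verbatim}
\usepackage{moreverb}

% Map U+03A3 to a text-mode glyph rather than to math mode
\DeclareUnicodeCharacter{03A3}{\textSigma}

\begin{document}

\begin{listing}{1}
sum = Σ(s)
\end{listing}

\end{document}

The glyph comes from the Greek font selected by textgreek, so it may not match the typewriter font used for the rest of the listing.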