One way to get around this limitation of listings is to use the option extendedchars=true and then the literate option for each accented character you're going to use (it's a bit tedious, but once you've covered all the accents of your language, you never have to worry about them again). The syntax is
literate={á}{{\'a}}1 {ã}{{\~a}}1 {é}{{\'e}}1
For each accented character, put the real character inside braces (e.g. {á}), then what you want it to be typeset as inside double braces (e.g. {{\'a}}), and finally the number 1, which is the width of the output in characters; between two entries, you can put a space for clarity.
Here's your example modified to use this:
\documentclass[12pt,a4paper]{scrbook}
\KOMAoptions{twoside=false,open=any,chapterprefix=on,parskip=full,fontsize=14pt}
\usepackage[portuguese]{babel}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{listings}
\usepackage{xcolor}
\usepackage{inconsolata}
\lstset{
language=bash, %% Change to PHP, C, Java, etc.; bash is the default
basicstyle=\ttfamily\small,
numberstyle=\footnotesize,
numbers=left,
backgroundcolor=\color{gray!10},
frame=single,
tabsize=2,
rulecolor=\color{black!30},
title=\lstname,
escapeinside={\%*}{*)},
breaklines=true,
breakatwhitespace=true,
framextopmargin=2pt,
framexbottommargin=2pt,
inputencoding=utf8,
extendedchars=true,
literate={á}{{\'a}}1 {ã}{{\~a}}1 {é}{{\'e}}1,
}
\begin{document}
\begin{lstlisting}
<?php
echo 'Olá mundo!';
print 'áãé';
\end{lstlisting}
\end{document}
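The same pattern extends to the remaining accented letters. As a minimal sketch, assuming Portuguese text and the same preamble as above (the selection of characters is an assumption, not part of the original answer), the literate table could be filled in like this:

```latex
% Sketch: literate table for common Portuguese accented letters.
% Assumes listings together with inputenc (utf8) and fontenc (T1), as in the example above.
\lstset{
  extendedchars=true,
  literate=
    {á}{{\'a}}1 {à}{{\`a}}1 {â}{{\^a}}1 {ã}{{\~a}}1
    {é}{{\'e}}1 {ê}{{\^e}}1
    {í}{{\'i}}1
    {ó}{{\'o}}1 {ô}{{\^o}}1 {õ}{{\~o}}1
    {ú}{{\'u}}1
    {ç}{{\c c}}1
    {Á}{{\'A}}1 {À}{{\`A}}1 {Â}{{\^A}}1 {Ã}{{\~A}}1
    {É}{{\'E}}1 {Ê}{{\^E}}1
    {Í}{{\'I}}1
    {Ó}{{\'O}}1 {Ô}{{\^O}}1 {Õ}{{\~O}}1
    {Ú}{{\'U}}1
    {Ç}{{\c C}}1,
}
```

Each entry still follows the {character}{{replacement}}1 shape described above; the line breaks between entries are only there for readability.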
The file is in a Latin-1 or Latin-9 encoding (that is, ISO 8859-1 or ISO 8859-15). You can convert it to UTF-8 with the recode or iconv command:
recode --diacritics --touch --verbose latin1..UTF-8 <file>
or
iconv -f LATIN1 -t UTF-8 inputfile.tex > outputfile.tex
Best Answer
If you want to know how the 8-bit engines handle utf8 input you can use \tracingmacros:
which gives
That means that the first byte of the ä (the Ã) is an active character, a command which picks up the next byte and then calls \u8:ä, which calls \"a. In this way (pdf)latex can handle quite a lot of UTF-8 input, but it has problems with, for example, "char + combining accent", as there is no sensible way for the combining accent to go back and add an accent to the preceding character.
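A minimal sketch of how such a trace could be produced, assuming pdflatex and a UTF-8 encoded source file (this is an illustrative document, not the one used in the original answer): \tracingonline=1 sends the trace to the terminal as well as to the log, and the braces keep the tracing local to the single character.

```latex
% Sketch: trace how an 8-bit engine expands a UTF-8 encoded character.
% Assumes the file is saved as UTF-8 and compiled with pdflatex.
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\begin{document}
% The group limits the tracing to this one character.
{\tracingonline=1 \tracingmacros=1 ä}
\end{document}
```

The expansion steps then appear in the terminal output and in the .log file.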