The solution is to add the option literate={\ \ }{{\ }}1 to your \lstset.
This declares that each occurrence of two consecutive spaces is to be replaced with a single space, so you don't need to modify your files.
MWE:
\documentclass[10pt,a4paper,oneside]{article}
\usepackage[latin1]{inputenc}
\usepackage[english]{babel}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage[T1]{fontenc} %use different encoding (copy from pdf is now possible)
\usepackage{fullpage} %small margins
\usepackage{color}
\definecolor{light-gray}{gray}{0.85}
\usepackage{listings} %sourcecode
\lstset{
numbers=left,
breaklines=true,
backgroundcolor=\color{light-gray},
tabsize=2,
basicstyle=\ttfamily,
literate={\ \ }{{\ }}1
}
\begin{document}
\section{With 4 leading spaces}
\lstinputlisting{code1.mcf}
\section{With `real' tabs}
\lstinputlisting{code2.mcf}
\end{document}
Output:
(it seems this works everywhere apart from acrobat reader)
This is based on the example by @DavidCarlisle.
The cmtt visible space character seems to be labelled differently in different cmtt variants. For cm-super (which is loaded here when I use \usepackage[T1]{fontenc}), the respective character is named uni2423, which seems to cause problems with evince when copying that character.
So I rigorously mapped every glyph that looks like a space to a non-breaking space.
You might want to restrict this to verbatim ;-)
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{color}
\input{glyphtounicode}
\pdfglyphtounicode{visiblespace}{A0}
\pdfglyphtounicode{blank}{A0}
\pdfglyphtounicode{visualspace}{A0}
\pdfglyphtounicode{uni2423}{A0}
\pdfgentounicode=1
\begin{document}\showoutput
\makeatletter
\def\@xobeysp{\textcolor{white}{\char32}}
\makeatother
\begin{verbatim}
def myfunction(x):
    return x
\end{verbatim}
\end{document}
I am inclined to consider it a bug that apparently no (consecutive or beginning-of-line) spaces can be copied from Acrobat.
Or is this specified anywhere?
At least the behaviour is exactly the same with official Adobe documents such as the PDF Reference.
So I consider this answer valid no matter what :-)
Best Answer
There are other invisible characters than the space character (ASCII code 32); one of them is the tabulator, a.k.a. tab character (ASCII code 9), which is used in some types of source files, most notably, perhaps, in makefiles. For more information about invisible characters in general, I refer you to the Wikipedia page on ASCII control characters.
The listings package treats space characters and tabulators very differently, as explained in subsection 2.5 of the listings documentation.
If you don't want any surprises, and if your code could do with just spaces and no tabulators, sticking to spaces and eschewing tabulators entirely is probably a good idea.
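To see the difference for yourself, a minimal sketch along the following lines may help. It uses the listings keys showspaces, showtabs, tab and tabsize to make both kinds of whitespace visible. The listing content is only a placeholder: the indentation is typed with spaces here, so replace it with a real tab character in your own file to compare the two.

```latex
\documentclass{article}
\usepackage{listings}
\lstset{
  basicstyle=\ttfamily,
  % print every space as a visible-space glyph
  showspaces=true,
  % print tabulators visibly as well
  showtabs=true,
  % glyph used for a visible tabulator (instead of the default)
  tab=\rightarrowfill,
  % tab stops every 4 columns
  tabsize=4
}
\begin{document}
\begin{lstlisting}
def myfunction(x):
    return x
\end{lstlisting}
\end{document}
```

With these settings, spaces and tabulators should be printed with different glyphs, which makes it easy to spot which lines of a file still contain tabs.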
I'm guessing you're using a LaTeX IDE, such as texmaker, as opposed to a text editor, such as emacs or vim. By default, your IDE most likely inserts a tab character when the Tab key is pressed, but there should be a way to configure it to insert a fixed number of spaces (typically 2, 4, or 8) instead. Such an option may not be retroactive, though; in other words, tab characters that were already present in your file may not be replaced by spaces simply by activating that option. You may have to search for tabs and replace them with spaces to get rid of all of them. This is arguably error-prone, but you may not have any other option if you use an IDE. However...
Edit (following Barbara Beeton's comment): ... editors such as emacs and vim have a built-in function to efficiently replace all tabs by spaces in the current buffer. In emacs, you have untabify; in vim, you have retab (see this).
Good text editors also allow you to make invisible characters (tabulators, newlines, etc.) visible, so you know what type of invisible character you're actually dealing with. In the picture above (screenshot of a vim buffer), the ▸ symbol denotes a tabulator and the ¬ symbol denotes a newline.
That's two more reasons to leave your IDE behind and start using a text editor :)
Update: texmaker 4.2 (2014/05/01) now allows advanced users to extend the IDE's functionality by running custom JavaScript code from within. This opens the door to a "retab" script, but assumes you know a modicum of JavaScript.