[TeX/LaTeX] Format a verbatim paragraph

environments, formatting, line-breaking, paragraphs, verbatim

How can I format a verbatim paragraph? I.e. break, fill and join input lines to produce globally balanced output with the lengths of each line approaching the target \textwidth as closely as possible.

Using the listings package I can break lines, but I cannot join short lines. That is, the following

\documentclass{article}
\usepackage{listings} 
\lstnewenvironment{exampleA}{\lstset{% 
  language=,
  basicstyle=\ttfamily, 
  breaklines=true,
  prebreak=+,
  postbreak=->,
  columns=fullflexible,
  breakindent=5pt} }{} 
\begin{document}
\begin{exampleA}
Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line 
Short Line 
Short Line 
Short Line 
\end{exampleA}
\end{document}

produces
[image: typeset output of the example above]

What I would like to achieve is similar to what LaTeX does by default with paragraphs. That is, for example,

\documentclass{article}
\newenvironment{exampleB}{\ttfamily}{}
\begin{document}
\begin{exampleB}
Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line 
Short Line 
Short Line 
Short Line 
\end{exampleB}
\end{document}

will produce:
[image: typeset output of the example above]

The problem with the latter approach is that I cannot use special symbols like # or % in the verbatim text. So I need to use a verbatim environment like lstlisting or fancyvrb's Verbatim environment. (As mentioned, automatic line breaking is possible in the lstlisting environment; for the Verbatim environment this feature seems to be missing as well.)

EDIT: (This is a comment on Frank's answer below)

Now there seems to be a problem with lines becoming too short. The following input

\noindent\hrulefill
 \begin{myverbatim}
  xxxxxxx xxxx xxxxx: 1234567890123456789012345,
  123, 1234567890,
\end{myverbatim}
\noindent\hrulefill

now produces

[image: typeset output of the example above]

(I would now like to have the word "123" at the end of the first line, as there is certainly plenty of space for it there :))

Update

The above problem has now been solved; see the comments on Frank's answer below.

Best Answer

David did beat me by a couple of minutes, but this version here does indentation as requested and is not producing overfull lines (within reason):

\documentclass{article}

\makeatletter

% this defines the myverbatim environment; to change the name, replace "myverbatim" in all places below (strictly speaking it is only necessary in some, but ... :-)

\newdimen\outerparindent
\def\myverbatim{%
% fix for \@noligs as the definition in LaTeX is swallowing any following space
  \def\do@noligs##1{%
     \catcode`##1\active
     \begingroup
       \lccode`\~`##1\relax
       \lowercase{\endgroup\def~{\leavevmode\kern\z@\char`##1 }}}%
% save the \parindent used outside
   \outerparindent\parindent
% I'm lazy and reuse the existing setup of verbatim as much as possible, so \obeylines is my way to hook in
   \def\obeylines{\rightskip=0pt plus 1fil
                  \parindent=\outerparindent  % if you like a defined \parindent value instead set it here
                  \let\par\@@par
                  \leavevmode\indent}%
% different definitions to handle spaces, select one:
   \def\@xobeysp{\penalty\z@\char 32 \penalty\z@}%        % this produces a visible space
%   \let\@xobeysp\space                                    % this version will drop spaces at linebreaks
%   \def\@xobeysp{\penalty\z@\mbox{}\space\penalty\z@}%    % this will keep spaces after line breaks
%
  \@verbatim
  \@myverbatimescape\@myverbatimnewline
  \frenchspacing\@vobeyspaces\@xmyverbatim}

\let\endmyverbatim\endverbatim

% setting up the behavior of end-of-line: on its own it behaves like a (special) space,
% two in a row end a paragraph
\begingroup
\catcode`\^^M=\active%
\gdef\@myverbatimnewline{\catcode`\^^M=\active \let^^M\@xmyverbatimnewline}%
\gdef\@xmyverbatimnewline{\@ifnextchar ^^M{\@myverbatimpar}{\@xobeysp}}%
\gdef\@ymyverbatimnewline{\@ifnextchar ^^M{\@myverbatimpar}{}}%
\gdef\@myverbatimpar ^^M{\par%
                         \vskip\baselineskip%     % this line will generate an extra baselineskip per empty line
                                                  % comment out if not wanted
                         \@ymyverbatimnewline}%
% and this part is to get rid of the first ^^M after \begin{myverbatim} but not any other
\gdef\@zmyverbatimnewline{\@ifnextchar ^^M{\@zmyverbatimpar}{}}%
\gdef\@zmyverbatimpar^^M{\@ifnextchar ^^M{\@myverbatimpar}{}}%
\endgroup

\begingroup \catcode `|=0 \catcode `[= 1
\catcode`]=2 \catcode `\{=12 \catcode `\}=12
%  to support \\ as a line break we need to make \ active and make a lot of extra horrible definitions:
\catcode`\\=13
|gdef|@myverbatimescape[|catcode`|\|active|let\|@myverbatimbslash]
%  using \@ifnextchar is ok because "space" is active too so isn't gobbled in case of  "\ \"
|gdef|@myverbatimbslash[|@ifnextchar\[|@xmyverbatimbslash][|string\]]
|gdef|@xmyverbatimbslash\[|\]
% a newline char before the end of the environment also needs some special handling therefore the gymnastics
|catcode`|^^M=|active
|long|gdef|@xmyverbatim#1\end{myverbatim}[|@zmyverbatimnewline#1^^M|vskip-|lastskip|vskip|z@skip|end[myverbatim]]%
|endgroup

\makeatother


\begin{document}

Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line 
Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line Normal line 

 \begin{myverbatim}
Long line Long line with \ or \ \ is okay but \\now breaks the line Long line Long line Long line Long line Long line Long line Long line Long line Long line Long line 

Long line Long line Long line Long line Long line 
Short Line 

Short Line  # ^&  % 
Short Line 
\end{myverbatim}

\end{document}

As requested in a comment, the code above now supports \\ as a line break, while a single \ will generate a backslash. The code gets a little messy in that case, but that is kind of the price of changing catcodes around :-)

Output would then be

[image: typeset output of the example above]

Update

Initially I had set \rightskip to 0pt plus 4em basically because I forgot that no hyphenation is going on in this type of environment. So with long "words" that could result in overfull lines. So I now changed it above to 0pt plus 1fil so as long as everything is shorter than a full line one should not get any overfull lines.
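The effect of that setting can be tried outside the environment. The following is only a minimal sketch under the default article setup; the sample text and the long "word" are made up for illustration and are not taken from the code above:

```latex
\documentclass{article}
\begin{document}
\begingroup
  \ttfamily                  % typewriter text is not hyphenated in the default setup
  \rightskip=0pt plus 1fil   % every line may end arbitrarily far from the right margin
  \noindent
  some short words and then anunbreakablewordofconsiderablelength123456789
  followed by more short words so that the paragraph runs over several lines
  and TeX has to choose its break points at the spaces only.
  \par                       % end the paragraph while \rightskip is still in force
\endgroup
\end{document}
```

Replacing 1fil by 4em in this sketch should show the difference: a line that has to break early before a long word can then no longer stretch far enough, so TeX may complain about bad lines.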

Of course line breaks only happen at spaces, so in the example given by the OP it will break after the first set of digits as the second already makes the line overfull.

Update 2

As it turned out, the solution above would swallow one space after a , in the input (in fact after a certain set of characters). That strange behavior is due to what I would claim is a documentation error in the TeXbook (page 44), namely that

\chardef\%=`\%

is simply a more efficient way to achieve the same as

\def\%{\char`\%}

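In fact it isn't. The latter definition will swallow an optional following space; the former doesn't! (After the character constant in \char`\%, TeX still looks for one optional space to terminate the number, whereas a \chardef token is complete in itself.) And that generates the issue, as the comma in verbatim is active to prevent ligatures, and its definition is generated by the following code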

\def\do@noligs#1{%
  \catcode`#1\active
  \begingroup
     \lccode`\~`#1\relax
     \lowercase{\endgroup\def~{\leavevmode\kern\z@\char`#1}}}

which means that it expands to

\leavevmode\kern\z@\char`\,

and thus a following space is swallowed. So either one has to add an additional space into the definition of \do@noligs or one has to change \@xobeysp to expand to something like

\def\@xobeysp{\kern\z@\space}%

so that there is always something between the \char and the \space. The latter approach doesn't work for spaces generated by a line break in the source, as this is by default translated into a straight space (which would be swallowed). So either one has to change the behavior of ^^M (end of line) or one has to use the fix to \do@noligs.
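The difference between the two kinds of definition can be checked in isolation. The following is a minimal sketch and not part of the solution; the names \pcA and \pcB are made up for the test, and \space is used because a literal blank after a control word would be skipped by TeX's tokenizer anyway:

```latex
\documentclass{article}
\begin{document}
% \pcA is a \chardef token: complete in itself, nothing is scanned after it
\chardef\pcA=`\%
% \pcB ends in a character constant, so after expansion TeX still
% looks (with expansion) for one optional space to finish the number
\def\pcB{\char`\%}

A: x\pcA\space y   % the space produced by \space should survive

B: x\pcB\space y   % the space produced by \space should be swallowed
\end{document}
```

If the explanation in this update is right, line A shows a gap between the percent sign and the y while line B does not.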

Update 3

Oh well :-) in normal paragraphs (and that was originally requested) spaces at line breaks vanish and this is what the current solution does. If this is not desired then one possibility is to change the definition of \@xobeysp in the code above.

Let's assume we have the following input:

\begin{myverbatim}
xxxxxxxxxxx xxxx xxxxx: 1234567890123456789012345,        123,                                   1234567890,

xxxxxxxxxxx xxxx xxxxx: 1234567890123456789012345,   \\  123,                                   1234567890,

xxxxxxxxxxx xxxx xxxxx: 1234567890123456789012345,   \\  
       123,                                            1234567890,
\end{myverbatim}

Then the current solution above results in

[image: typeset output of the example above]

i.e., spaces at the natural and the forced line break vanish. Now if we instead use the following definition:

\def\@xobeysp{\penalty\z@\char 32 \penalty\z@}%

which typesets the character in position 32 of the font (space) with a break allowed before and after it, then we get the following output:

[image: typeset output of the example above]

There are penalties on both sides so that a break can be taken before or after the space character. By default the best break will be the last one, which means the spaces tend to end up on the right side, but in an emergency a break can be taken before the first space char in a row.

Alternatively we could use:

\def\@xobeysp{\mbox{}\space\penalty\z@}%

in which case a space will not vanish into the left margin and we get the same results only that spaces are now "blanks again".

[image: typeset output of the example above]

(By the way, the indentation is 15 points and a space in typewriter seems to be 5.24995pt, which means that the indentation is nearly but not quite 3 spaces, so this might be an area for improvement.)
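The two numbers can be compared directly. This is a small sketch assuming the default Computer Modern Typewriter font at 10pt; \ttindent is a hypothetical register name that the code above does not use:

```latex
\documentclass{article}
\newdimen\ttindent % hypothetical name, not used by the answer's code
\begin{document}
% \fontdimen2 of a font is its interword space; in Computer Modern
% Typewriter this is also the width of every character
{\ttfamily \global\ttindent=3\fontdimen2\font}
Outer \verb|\parindent|: \the\parindent;
three typewriter spaces: \the\ttindent.
\end{document}
```

A register computed this way could presumably be used in place of \outerparindent at the spot marked in the redefinition of \obeylines, which would make the indentation exactly three typewriter spaces (roughly 15.75pt here).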

Update 4

With the OP further clarifying what he is looking for, a new requirement showed up:

  • an empty line should result in an empty line in the output, i.e., it should not just end the paragraph (as normally happens in TeX) but explicitly generate an empty line, and two empty lines should thus generate 2, etc.

So I updated the code above once more to include explicit handling of end of line processing. For this ^^M is made active (i.e., calls a command) rather than doing its standard magic. That command then is looking if it is followed by another ^^M. If not it will execute \@xobeysp (generating whatever is set up in the case for a normal space). On the other hand, if another newline character follows (i.e., if we have an empty line) it will generate a \par followed by \vskip\baselineskip to generate a vertical space equivalent to an empty line. It then continues to look for further newline characters to generate more vertical space if necessary.

There are two boundary cases that need special handling, which make the code even worse: after \begin{myverbatim} there is typically a newline character that should not trigger this behavior, and just before \end{myverbatim} there is another one that shouldn't generate a space. The code above handles the first case well. The second does not work correctly if you do not have \end{myverbatim} on a line by itself, but there you go ... this answer is already much longer than it was ever intended to be :-)

If we again use the above example but add two blank lines in we now get the following:

[image: typeset output of the example above]
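For readers who want to study the end-of-line mechanism from Update 4 without the rest of the verbatim machinery, here is a stripped-down sketch. It is an illustration only: the names \demoeol, \demo@eol, \demo@par and \demo@more are made up, a plain \space stands in for \@xobeysp, and none of the boundary cases discussed above are handled.

```latex
\documentclass{article}
\makeatletter
\begingroup
\catcode`\^^M=\active %
% switch: make end-of-line active and let it call \demo@eol
\gdef\demoeol{\catcode`\^^M=\active \let^^M\demo@eol}%
% a single end-of-line acts as a space, two in a row start \demo@par
\gdef\demo@eol{\@ifnextchar^^M{\demo@par}{\space}}%
% an empty line: end the paragraph, add one line of vertical space,
% then check whether further empty lines follow
\gdef\demo@par^^M{\par\vskip\baselineskip\demo@more}%
\gdef\demo@more{\@ifnextchar^^M{\demo@par}{}}%
\endgroup
\makeatother
\begin{document}
\begingroup\demoeol
first line
second line

after the empty line
\endgroup
\end{document}
```

Here the first two source lines should be joined into one paragraph, and the empty line should end that paragraph and add one \baselineskip of space before the next one.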