[TeX/LaTeX] Line-break-aware custom environment

environments line-breaking

I have a use case where I would like to apply some special formatting to text that is read in via the \input command. The text is a customer's address that is printed in a form. Ideally, the input file should not contain any TeX commands, and its line breaks should be preserved in the output.

So if the input file is:

John Doe
Some Street X
ZIP City

the LaTeX template contains:

 \begin{customer}
   \input{customer}
 \end{customer}

I know how to create a new environment, but how can I preserve the line breaks from customer.tex? I don't want the .tex file to contain \\ for the line breaks. The verbatim environment does not fit well here since it uses its own font and formatting.

P.S.:
A short note: the verbatim environment does not work, since it does not preserve the formatting set by the customer environment. I need to add line-break awareness to my customer environment. E.g. the customer environment may cause the text to be formatted as {\large \textsc{...}}, or whatever else.

Best Answer

Wrapping \input in an environment is redundant: something like \custinput{customer} might be better. Let's try:

\newcommand{\custinput}[1]{%
  \begingroup\setlength{\parindent}{0pt}\obeylines
  \input{#1}\endgroup}

\custinput{customer}
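For a complete picture, here is a minimal sketch of a full document built around this macro. The filecontents* environment is there only to make the example self-contained by writing customer.tex; the \large\scshape formatting is an assumption taken from the question's "{\large \textsc{...}}" example, not something the macro requires.

```latex
% Writes customer.tex with the three address lines from the question,
% so the example can be compiled on its own.
\begin{filecontents*}{customer.tex}
John Doe
Some Street X
ZIP City
\end{filecontents*}

\documentclass{article}

\newcommand{\custinput}[1]{%
  \par                          % finish any paragraph in progress
  \begingroup
  \setlength{\parindent}{0pt}%  % no indentation for the address lines
  \large\scshape                % assumed formatting, adjust as needed
  \obeylines                    % every end of line now acts as \par
  \input{#1}%
  \endgroup}

\begin{document}
\custinput{customer}
\end{document}
```

Because the font switches are issued inside the group and before \input, they apply to the whole address and are undone by \endgroup.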

LaTeX borrows the macro \obeylines from plain.tex; in essence, its definition is

\def\obeylines{\catcode`\^^M\active\let^^M\par}

Saying \catcode\endlinechar\active could be safer, but the end-of-line character rarely differs from the default one. \obeylines should be issued inside a group, because otherwise its effect will last "forever". In this group the active end-of-line is defined to be equivalent to \par.
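As a small sketch of the grouping point, the same idea can be tried directly in a document without an external file. The address lines are just the sample data from the question.

```latex
\documentclass{article}
\begin{document}
\begingroup
\setlength{\parindent}{0pt}%
\obeylines
John Doe
Some Street X
ZIP City
\endgroup
This text is set as an ordinary paragraph again,
so this line break is just a space.
\end{document}
```

Inside the group each source line becomes its own paragraph; after \endgroup both the category code of the end-of-line character and \parindent return to their previous values.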

A bit of theory: TeX appends a character to each line it reads after disposing of the operating system end-of-record signal. This is because every operating system has its own ideas of how a record should end; some use a combination <CR><LF>, others only <LF>, others only <CR> (<CR> is ASCII 13, <LF> is ASCII 10). Others don't use any character at all, because their records have fixed length and the record is padded by ASCII <NUL> characters or even spaces.

The default end-of-line character appended by TeX is <CR>, but it depends on the value of the integer parameter \endlinechar (whose initial value is 13). If this value is negative, no end-of-line character is inserted, which is sometimes useful in applications. The category code of this character is usually 5, which means that it is normally converted to a space, but when TeX finds it on an otherwise empty line (spaces at the beginning of a line are ignored) it is converted to \par. From this follows the property that a blank line ends a paragraph.
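The role of \endlinechar can be seen in a short experiment. This is only an illustrative sketch; the words "one" and "two" are placeholders.

```latex
\documentclass{article}
\begin{document}
% Default setting: the end of line after "one" becomes a space,
% so this paragraph prints "one two".
one
two

% With a negative \endlinechar nothing is appended to the lines
% read afterwards, so the words are joined and print as "onetwo".
\begingroup
\endlinechar=-1
one
two
\par
\endgroup
\end{document}
```

The explicit \par is needed in the second part because, with no end-of-line character appended, a blank line would no longer end the paragraph.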

Setting the category code of this character to \active (that is, 13) allows us to define it to be anything we want; we'll use the meaning assigned by \obeylines, that is \par, which does what's intended. A convenient way to refer to ASCII 13 as a character is ^^M. However, we couldn't say

\newcommand{\custinput}[1]{\begingroup
   \catcode`^^M=\active \def^^M{something}%
   \input{#1}\endgroup}

because at the moment of the definition the category code of ^^M is not 13, so the ^^M following \def is not stored as an active character. Indeed the real definition of \obeylines is

{\catcode`\^^M=\active % these lines must end with %
 \gdef\obeylines{\catcode`\^^M\active \let^^M\par}%
 \global\let^^M\par} % this is in case ^^M appears in a \write

and this shows why using \obeylines in the definition is "more elegant".
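If a meaning other than \par is wanted for the end of line, the same trick can be reused. The following is a sketch under assumptions: \custeol is a hypothetical helper name, and the extra 2pt of space after each line is an arbitrary choice made only for illustration.

```latex
% Define the helper while ^^M is active, so that the ^^M inside the
% definition is stored as an active character token.
\begingroup
\catcode`\^^M=\active %  lines in this group must end with %
\gdef\custeol{\def^^M{\par\vspace{2pt}}}%
\endgroup

\newcommand{\custinput}[1]{%
  \begingroup
  \setlength{\parindent}{0pt}%
  \catcode`\^^M=\active  % ends of lines read from now on are active
  \custeol               % hypothetical helper: give them a meaning
  \input{#1}%
  \endgroup}
```

Here the category code change and the redefinition are both local to the group opened in \custinput, so the ordinary treatment of line ends is restored after the file has been read.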