[TeX/LaTeX] New lines and TeX: difference between ^^J and ^^M

line-breaking tex-core

What is a new line for TeX in the following contexts:

  1. When reading from a file.
  2. When writing to a file.
  3. After having read a % character.
  4. In a \scantokens.

I am asking in particular because the following code only typesets A:

\documentclass{minimal}
\begin{document}
\catcode`\%=12
\def\foo{\scantokens{A%    

B}}
\show\foo
\catcode`\%=14
\foo
\end{document}

So my main question is: how does % know where to stop gobbling characters?

EDIT: Adding the two lines

\catcode`\^^M=12 
\newlinechar`\^^M

before the definition of \foo is instructive: then the definition actually contains new-lines, and the comment stops gobbling where we expect.

EDIT2: pdflatex sets \newlinechar`\^^J and \endlinechar`\^^M (see Harald's concise answer below for what these are).

Best Answer

  1. I think that what constitutes the end of a line when reading from a file is hardcoded for whatever operating system you are running on. Once a line has been read, TeX strips its trailing spaces and appends the character whose number is \endlinechar; that character is what represents the end of the line from then on.
  2. When writing, the character whose number is \newlinechar triggers the end of a line. Again, what exactly ends up in the output file is hardcoded and depends on your operating system.
  3. See #1: a comment character discards the rest of the current input line, up to and including the character appended for \endlinechar.
  4. Usually, the argument to \scantokens is treated as a single line. Thus a percent sign in the argument (if it has category code 14 when the argument is rescanned) ends input from this argument. However, any occurrence of the character whose number is \newlinechar splits the argument into several lines.
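
Items 1 and 2 can be tried in isolation. The following is only a minimal plain TeX sketch, assuming the usual plain defaults (^^J with category code 12, \newlinechar=-1, \endlinechar=13); the message text and the letters a and b are placeholders of mine, not part of the question.

```tex
% Item 2: while writing, the character numbered \newlinechar ends a line.
\newlinechar=`\^^J
\immediate\write16{first line^^Jsecond line}% appears as two lines on the terminal

% Item 1: while reading, TeX appends the character numbered \endlinechar
% to every input line; a negative value means nothing is appended.
\begingroup
\endlinechar=-1 %
a
b
\endgroup
\bye
```

With nothing appended to the lines holding a and b, no space token is produced between them, so this should typeset “ab” rather than “a b”. The \endgroup restores \endlinechar before the \bye line is read.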

To bring all these ideas together, consider the plain TeX file (to be run with an e-TeX-enabled engine, since \scantokens is an e-TeX primitive)

\newlinechar=2
{\catcode`\%=12
 \gdef\foo{\scantokens{abc%xyz^^Bdef}}}%
\endlinechar=`X
\foo%
\bye%

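which will typeset the text “abcdefX”. The ^^B (character 2, the current \newlinechar) splits the argument into the two lines abc%xyz and def, and X is appended to each of them as the \endlinechar. On the first line the comment discards xyz together with its X; on the second line the X survives.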
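
For contrast, here is a sketch of the same file with the \newlinechar assignment removed, assuming plain TeX's default of \newlinechar=-1. The argument is then rescanned as one single line, so the comment should swallow everything after abc, including the ^^B, def and the appended X.

```tex
% Same as above, but \newlinechar keeps its plain TeX default (-1),
% so ^^B no longer splits the \scantokens argument into lines.
{\catcode`\%=12
 \gdef\foo{\scantokens{abc%xyz^^Bdef}}}%
\endlinechar=`X
\foo%
\bye%
```

This variant should typeset only “abc”, which mirrors the behaviour in the question, where everything after the percent sign was lost.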

(Edited to take into account what I learned about #4 from the comments.)