[TeX/LaTeX] Input/Output primitives of TeX

input output tex-core

I'd like to write some simple macros for handling things like cross-references, for use in plain TeX. (I know there are already macros devoted to this, like the ones in Eplain, but I would like to try something different by myself.)
So I need to know how to read from a file and how to write to a file. What are the TeX primitives that do such things? How do they work?

Another question: Can TeX 'call' other programs while it is running? I mean: Is there in TeX an equivalent to the system function present in the C language?

Best Answer

TeX has the \read and \write primitives for reading from and writing to files, plus of course \input for inputting an entire file 'here'. If you look at, for example, the LaTeX cross-ref mechanism, it uses \write but avoids using \read (line-by-line) in favour of \input with appropriately designed secondary files.

As \input is easy enough to understand, let's focus on \read and \write. Both of these work on a file stream, which is identified by a number but is usually allocated using \new.... For example

\newread\myread
\openin\myread=myinput %

\newwrite\mywrite
\immediate\openout\mywrite=myoutput %

will set up a read stream called \myread and a write stream called \mywrite. Notice that I've used \immediate with \openout: like \write, \openout is normally deferred until the current page is shipped out, and because the TeX page builder is asynchronous you need to make sure that file operations happen in the 'correct' place. (More on this below.)

With two streams open we can for example write to the output. If we do two writes, one 'now' and one 'delayed'

\def\foo{a}
\immediate\write\mywrite{\foo}
\write\mywrite{\foo}
\def\foo{b}
Hello
\bye

the result is myoutput.tex reading

a
b

That's because \write\mywrite without \immediate produces a whatsit that is only executed when the page containing it is shipped out. That's useful if, for example, what you need to write contains a page number, as that is only known during the output stage. Also notice that \write acts like \edef: everything gets expanded unless you prevent it using \noexpand or a token register (toks). Note, however, that this expansion is performed at the moment the \write operation is actually executed, so one must ensure macros have proper definitions at shipout when using a delayed \write.
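As a rough illustration of how the delayed \write and \noexpand fit together for cross-references, here is a minimal sketch. The names \xrefout, \xrefdef and the .xrf extension are made up for this example and are not part of plain TeX or any package.

```tex
% Hypothetical names: \xrefout, \xrefdef and the .xrf extension are
% invented for this sketch.
\newwrite\xrefout
\immediate\openout\xrefout=\jobname.xrf % open the file right away

% \label{key}: no \immediate, so the \write becomes a whatsit that is
% executed when the page holding it is shipped out.
% \noexpand stops \xrefdef from being expanded in the \write;
% \the\pageno is expanded at shipout, when the page number is known.
\def\label#1{\write\xrefout{\noexpand\xrefdef{#1}{\the\pageno}}}

Some text that we want to refer to.\label{intro}
\bye
```

After the run, the .xrf file should contain one line of the form \xrefdef {intro}{1}, which a later run could \input once \xrefdef has been given a suitable definition.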

The \read primitive, used as \read\myread to \macro, reads one line at a time into a macro (or more than one line if the braces on that line are not balanced) and tokenizes in the normal TeX way. You can arrange to loop over a file one line at a time using the \ifeof test on \myread, but as I say it's often easier to simply \input a file containing cross-refs.
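The line-by-line loop described above could be sketched as follows. This is only one way to write it; \readlines and \currentline are names chosen for this example, and the file myinput is assumed to exist next to the main file.

```tex
\newread\myread
\openin\myread=myinput % .tex is added if no extension is given

% Hypothetical helper: read lines until \ifeof reports the end.
% \ifeof is also true if the file could not be opened at all.
\def\readlines{%
  \ifeof\myread
    \closein\myread
    \let\next\relax            % stop the loop
  \else
    \read\myread to \currentline
    \message{[\meaning\currentline]}% show the line on the terminal
    \let\next\readlines        % go round again
  \fi
  \next}

\readlines
\bye
```

Two details to be aware of: each line normally ends with a space token (from the end-of-line character), and TeX supplies one extra empty line at the end of the file, so the last \read yields \par before \ifeof becomes true.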

If you want to do a system call, 'pure' TeX doesn't really help. However, Web2c has for a long time had a special 'stream' to allow escape to the system: \write18. This is a security risk, and so as standard only a restricted set of commands is allowed in such an escape. You can do for example

pdftex --shell-escape myfile

to allow all escapes: if you've written all of the code yourself, the only risk is making a mess of things! Doing a \write18 doesn't feed anything back to TeX: you'll need to arrange to read the result in some way, probably by having the command write to a secondary file and using \read on that.
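A minimal sketch of that round trip is shown here. It assumes a Unix-like shell with a date command and output redirection, and that the job is run with unrestricted shell escape; the scratch file name and the macro \shelloutput are invented for this example.

```tex
% Run with: pdftex --shell-escape myfile
% Assumption: a Unix-like shell where `date > file' works.
% \immediate makes the command run now rather than at shipout.
\immediate\write18{date > \jobname.tmp}

\newread\result
\openin\result=\jobname.tmp
\ifeof\result
  % File missing: most likely the command was not allowed to run.
  \message{No output file found: was shell escape enabled?}%
\else
  \read\result to \shelloutput % first line of the command's output
  \closein\result
  Output of the command: \shelloutput
\fi
\bye
```

The \ifeof test right after \openin doubles as a check that the file exists, so the document still compiles (with a terminal message) when the escape is not permitted.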

As noted in a comment, an additional syntax extension available is \input|"<command>". This is subject to the same restrictions as \write18, but it does provide an expandable method to grab input from shell commands.
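For comparison, a piped \input might look like the sketch below. It again assumes a Web2c-based engine such as pdftex, unrestricted shell escape, and a date command on the system; the output of the command is read as if it were the contents of a file.

```tex
% Run with: pdftex --shell-escape myfile
% Assumption: `date' is available and prints plain text.
The shell reports: \input|"date"

\bye
```

Since the command's output is tokenized like any other input, characters that are special to TeX (such as \%, \_ or \&) in that output would need separate handling.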
