[TeX/LaTeX] Why are the latex, pdflatex and pdftex binaries identical?

tex-core

I apologize if this question is somewhat off-topic, but I think this is the best place to find an answer. This is an oddity that has confused me for years. When I run latex -v I get:

pdfTeX 3.14159265-2.6-1.40.15 (TeX Live 2014)
kpathsea version 6.2.0
Copyright 2014 Peter Breitenlohner (eTeX)/Han The Thanh (pdfTeX).
There is NO warranty.  Redistribution of this software is
covered by the terms of both the pdfTeX copyright and
the Lesser GNU General Public License.
For more information about these matters, see the file
named COPYING and the pdfTeX source.
Primary author of pdfTeX: Peter Breitenlohner (eTeX)/Han The Thanh (pdfTeX).
Compiled with libpng 1.6.10; using libpng 1.6.10
Compiled with zlib 1.2.8; using zlib 1.2.8
Compiled with xpdf version 3.03

which is the same thing that I see if I run pdflatex -v. Diving a little deeper, I find that:

 ls -l /usr/local/texlive/2014/bin/x86_64-linux/latex
lrwxrwxrwx 1 root root 6 Aug 15  2009 /usr/local/texlive/2014/bin/x86_64-linux/latex -> pdftex*

and

 ls -l /usr/local/texlive/2014/bin/x86_64-linux/pdflatex
lrwxrwxrwx 1 root root 6 Aug 15  2009 /usr/local/texlive/2014/bin/x86_64-linux/pdflatex -> pdftex*

So it seems that latex and pdflatex are in fact the same. But obviously they are not: latex produces DVI files and pdflatex produces PDF files, just as they are supposed to. This was asked before as a "how" question (How can latex and pdflatex be both symbolic links to same executable (pdftex) and not behave the same?), and the answer given there is argv[0].

But why? Why are the programs packaged in the same binary, with the same version information? I don't think I've encountered this approach anywhere else, and it just seems so hacky. Why not maintain separate executables instead of a single executable with a switch-case on argv[0]?

Best Answer

It is no different from asking why two Java programs use the same Java runtime. Plain TeX and LaTeX both need a TeX engine to execute their code; they differ only in which format file (i.e. a memory dump of definitions) they load. Classically the format was specified by a command-line argument, tex &plain versus tex &latex, but as a convenience web2c implementations can use the program name to choose the default format. Similarly, the output format (DVI or PDF) can be specified on the command line (or using TeX syntax within the file), but the program name can be used to set its default.

Note that these are really not two different executables packaged as one; the program name just sets defaults for the command-line arguments. For example,

pdflatex '\pdfoutput=0 \input' file

or

pdflatex --output-format=dvi file

both use command-line arguments to produce DVI files with pdflatex.

The behaviour is identical to

latex file

The only difference is that --output-format gets a different default when the command name is latex.
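The defaulting described above can be illustrated with a minimal Python sketch. This is not how pdfTeX or web2c is actually implemented; the DEFAULTS table, the resolve_settings function, and the fallback for unknown names are all hypothetical and exist only to show the idea of one program choosing its defaults from the name it was invoked under, with an explicit option still taking precedence.

```python
import os
import sys

# Hypothetical table: invocation name -> (format to load, default output).
# The two names mirror the symlinks shown in the question; the real
# engine's lookup is more involved than a dictionary.
DEFAULTS = {
    "latex": ("latex", "dvi"),
    "pdflatex": ("pdflatex", "pdf"),
}


def resolve_settings(argv0, output_format=None):
    """Choose a format and output mode from the name the program was run as."""
    name = os.path.basename(argv0)
    # Unknown names fall back to (name, "pdf"); an arbitrary choice for this sketch.
    fmt, default_output = DEFAULTS.get(name, (name, "pdf"))
    # An explicit option always overrides the name-based default.
    return fmt, output_format or default_output


if __name__ == "__main__":
    # sys.argv[0] is the path the program was started with, symlink name included.
    print(resolve_settings(sys.argv[0]))
```

In this sketch, calling resolve_settings with a path ending in latex yields the DVI default, while passing output_format="dvi" together with the name pdflatex overrides the PDF default, which parallels the pdflatex --output-format=dvi example above.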