[TeX/LaTeX] What’s the point of INS and DTX files?

dtx packages

I just downloaded the installation files for a LaTeX package from CTAN. They mostly comprised a load of DTX files, and an INS file. The instructions said to run latex on the INS file, and I did. It produced a load of STY files, and a message appeared telling me to copy these STY files to my TeX directory tree. I did as instructed, and everything works fine.

But what was the point of all this DTX/INS shenanigans? Why aren't STY files just made available for immediate download in the first place?

Best Answer

The starting point for having a 'source' format is that it's a good idea to have code comments and it's also a good idea to have user documentation. The simplest way to do that of course is to put it all directly in the .sty file.

% Some info for the user
% ...
% Some code comment
\a \code \line

That is seen in a good number of packages.

The next consideration is that as TeX is a typesetting system we'd like to be able to typeset the documentation, not just read the 'raw' sources. That can be arranged by setting up the comments appropriately and reading in the source using some form of 'driver' (a small file that sets up to typeset rather than use the code/documentation).
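
As a rough illustration of such a driver, the sketch below assumes a hypothetical source file mypkg.dtx in which the documentation sits in comment lines and the code is wrapped in macrocode environments. The file names and the \hello macro are made up for the example. The ltxdoc class loads the doc package, and \DocInput reads the source with % treated as an ignored character, so the comment lines are typeset rather than skipped.

```latex
% mypkg-doc.tex: hypothetical driver file (all names are assumptions)
\documentclass{ltxdoc}
\begin{document}
  % Read the source with '%' ignored, so documentation lines get typeset
  \DocInput{mypkg.dtx}
\end{document}
```

A matching fragment of the assumed mypkg.dtx might look like this, where every line starting with % is documentation and the macrocode environment marks the code to be printed verbatim:

```latex
% \section{Usage}
% Some info for the user.
%
% Some code comment.
%    \begin{macrocode}
\newcommand\hello{Hello}
%    \end{macrocode}
```

In practice the driver is often embedded in the .dtx itself rather than kept as a separate file, but the idea is the same.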

This brings us to two partly-historical considerations. First, if comments are in the source, TeX has to read those lines each time the code is used. In general, the code will be used many more times than the documentation is read, so it makes sense to speed the process up if possible. Historically, stripping out comments made a significant difference here, so this was worth it: today you probably don't notice the improvement very much!

One could split the user documentation into a separate file from the code, and again this is common, particularly for larger packages. However, that means more files. Again, historically sources were exchanged by 'direct' methods (email, etc.), and so there was a strong case for minimising the number of files. This is of course not so much of a consideration today.

All of the above assumes our source extracts to exactly one .sty file. However, by using the DocStrip system we can extract multiple files from one source, reorder lines, etc. For example, if we have several related files (e.g. drivers for graphics inclusion, input encoding support, ...), they will have shared lines and code comments plus some unique lines. Using a .dtx we can have them in one place and extract out the separate parts. We can also combine multiple sources into one .sty: file loading time depends on the number of files used, so there is a benefit in putting a large code block together in one extracted file (while keeping the individual sources from getting too long!). This is seen in for example expl3 or the LaTeX kernel itself.
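
To make the extraction step concrete, here is a minimal sketch of how one source could yield several files. Everything named mypkg, and the guard names package, backend, pdftex and dvips, are hypothetical and chosen only for illustration. In the assumed mypkg.dtx, DocStrip guards mark which lines belong to which output:

```latex
%    \begin{macrocode}
%<*package>
\ProvidesPackage{mypkg}
%</package>
%<*backend>
\def\mypkg@shared{shared by every backend file}
%<pdftex>\def\mypkg@backend{pdftex}
%<dvips>\def\mypkg@backend{dvips}
%</backend>
%    \end{macrocode}
```

An .ins file along the following lines would then ask DocStrip to write three files from that single source, each with the shared lines plus its own unique ones:

```latex
% mypkg.ins: hypothetical installation script (all names are assumptions)
\input docstrip.tex
\keepsilent
\askforoverwritefalse
\generate{%
  \file{mypkg.sty}{\from{mypkg.dtx}{package}}%
  \file{mypkg-pdftex.def}{\from{mypkg.dtx}{backend,pdftex}}%
  \file{mypkg-dvips.def}{\from{mypkg.dtx}{backend,dvips}}%
}
\endbatchfile
```

Going the other way, a single \file entry can list several \from clauses, for example \file{mypkg.sty}{\from{mypkg-a.dtx}{package}\from{mypkg-b.dtx}{package}}, which is how multiple sources end up in one extracted file.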


As the above details, there are technical and historical reasons for using the .dtx format. As it's used by the LaTeX team, there is a reasonably strong sense of 'community buy-in' for the approach. Notably, for almost all end users today this is not a significant factor: the vast majority of users get code pre-extracted in TeX Live/MiKTeX, and CTAN also holds installable .tds.zip files for many packages. So other than developers, actually using .dtx files is not something that is widely done today.
