You want to keep in mind all the irregularities in your text: places where superficially uniform text turns out not to be uniform. You give periods in abbreviations as an example; they qualify because a period does not usually occur inside a sentence, but in this case it does (and only a speaker of idiomatic English could know that, not TeX). The differential in an integral is another example: it looks like the same kind of text as the integrand, but it is not part of the integrand (in this case, I could imagine TeX being written to look out for this, but there are probably good reasons it doesn't).
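A minimal sketch of these two irregularities, assuming a standard LaTeX `article` document; the sample sentences are made up for illustration:

```latex
\documentclass{article}
\begin{document}
% After an abbreviation the period does not end a sentence, so ask
% for an ordinary interword space (control space) or a tie.
Compare ``et al. proved'' with ``et al.\ proved'' and ``Dr.~House''.

% The differential is not part of the integrand: a thin space (\,)
% sets it off, because TeX cannot tell the two apart on its own.
Compare $\int_0^1 f(x) dx$ with $\int_0^1 f(x)\,dx$.
\end{document}
```

The differences are small, so it is worth setting both versions side by side and looking at the output.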
You should, of course, also look out for constructions which are superficially different but in fact are not. These may center on certain TeX idioms: for example, if you were programming in plain TeX (which you are not, and so you should not actually write this ever) you might do the following:
The following {\it italic text} is not well-spaced.
If you set that, you may see that the word "text" is a little too close to "is". In the TeXbook, Knuth reminds you to put in an "italic correction". However, now that LaTeX has \textit{...}, which takes care of this, you probably never even learned what an italic correction is. So this is a non-example. The point remains, however, that certain TeX constructions break the flow of the text (in particular, grouping) and you need to pay attention to the typeset result to see if they broke the spacing.
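For comparison, here is a small sketch of the three variants in LaTeX. It uses `\itshape` as the LaTeX counterpart of the plain TeX `\it` switch, which is my substitution rather than something from the example above:

```latex
\documentclass{article}
\begin{document}
% Font switch inside a group: no italic correction is inserted.
The following {\itshape italic text} is not well-spaced.

% Manual italic correction (\/) just before the group closes.
The following {\itshape italic text\/} is better spaced.

% \textit adds the italic correction itself where it is appropriate.
The following \textit{italic text} is better spaced.
\end{document}
```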
Vertical space can also be an issue, and harder to deal with. Knuth also warns against using tall symbols in the text (like \frac{1}{2} instead of 1/2) because they force the lines apart. Thus, you need to scrutinize all the inline math you write for tall symbols, and consider using displayed equations. Sometimes you can work around this using \smash if you know there is space and TeX doesn't.
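A rough sketch of the three options follows. The filler sentences are invented, and whether the fraction actually pushes the lines apart depends on the font and on \baselineskip, so treat it as something to try rather than a guaranteed effect:

```latex
\documentclass{article}
\begin{document}
% A built-up fraction is taller and deeper than ordinary text, so TeX
% may have to spread the neighbouring lines apart to fit it.
This paragraph needs to run over several lines so that the effect of
the fraction $\frac{1}{2}$ on the spacing between lines can be seen
in the typeset result, so here is some more text to fill it out.

% The slashed form has ordinary height and depth.
This paragraph needs to run over several lines so that the effect of
the fraction $1/2$ on the spacing between lines can be seen
in the typeset result, so here is some more text to fill it out.

% \smash hides the height and depth from TeX. Use it only when you
% can see that there is room; TeX will no longer protect you.
This paragraph needs to run over several lines so that the effect of
the fraction $\smash{\frac{1}{2}}$ on the spacing between lines can
be seen in the typeset result, so here is some more text to fill it out.
\end{document}
```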
Inline math causes another problem with TeX's line-breaking algorithm, because it won't break at a lot of places in an equation, commas being the notorious example. Thus, write $a$, $b$, and $c$ rather than $a, b, \text{ and } c$ or even $a, b$, and $c$. Knuth also wants you to put a tie in: and~$c$; I confess that I never use ties. Like manual spacing corrections in equations, they seem like they should be reserved for final polishing (I mean, if Dr. House is in the middle of a line, it's not going to break).
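The variants side by side, as a small sketch; the surrounding sentence is invented, and `amsmath` is loaded only because \text needs it:

```latex
\documentclass{article}
\usepackage{amsmath} % provides \text
\begin{document}
% One formula: TeX will not break the line after the commas.
The values $a, b, \text{ and } c$ are positive.

% Separate formulas: the interword spaces between them are
% ordinary breakpoints.
The values $a$, $b$, and $c$ are positive.

% With ties: no break between ``and'' and $c$, or inside ``Dr. House''.
The values $a$, $b$, and~$c$ are positive, as Dr.~House observed.
\end{document}
```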
In short, you need to watch for scope changes, mode changes, and changes in "semantic scope", where the last one is totally impossible to communicate to TeX and the other two are still insidious. However, you should not be afraid to "just try it" and see whether you really do have a problem. It is much faster to let TeX do whatever it does (and with TeX, "whatever it does" is sometimes all you can say easily) than to try to anticipate it.
Whether whatsits are aptly named is a matter of opinion; I think they would fit better in TeX's semantics if they were called afterallnodes. Whatsits represent commands whose execution is delayed, or special commands that are associated with a particular device or system and are not part of TeX's normal processing flow.
It is interesting to investigate Knuth's rationale for introducing them. In a meeting with NTG members on March 13th, 1996, Knuth said in reply to a question:
I tried to make the programs so that they would have logical structure
and it would be easy to throw in new features. This hasn’t happened
anywhere near as often as I thought because people were more
interested, I think, in inter-changeability of what they do; once you
have your own program, then other people don’t have it. Still, if I
were a large publisher, and I were to get special projects— some
encyclopaedia, some new edition of the Bible, things like that — I
would certainly think that the right thing to do would be to hire a
good programmer and make a special computer system just for this
project. At least, that was my idea about the way people would do it.
It seems that hasn’t happened very much, although in Brno I met a
student who is well along on producing Acrobat format directly in TeX,
by changing the code. And the Omega system that you mentioned, that’s
150,000 lines of change files [laughter].
I built in hooks so that every time TeX outputs a page, it could come
to a whatsit node and a whatsit node could be something that was
completely different in each version of TeX. So, when the program sees
a whatsit node, it calls a special routine saying, ‘how do I typeset
this whatsit node?’ It’ll look at the sub-type and the sub-type might
be another sub-type put in as a demo or it might be a brand new
sub-type.
A whatsit can appear in either a horizontal or a vertical list and has no dimensions. It signifies an operation that should be delayed because it doesn't fit into TeX's ordinary scheme of things. The paragraph builder and the page builder scan the lists submitted to them and execute certain types of whatsit. Whatsits are useful when associated with specific implementations.
The more common whatsits are the ones associated with the main vertical list:
(a) Delayed writes generated by \write. The token list of a delayed \write is not written out until the material surrounding the \write makes it to the output routine, where a \shipout is executed. Therefore, the write token list has to be stored on the main vertical list.
(b) Specials that use the \special command. The token list of a special command is stored with the main vertical list because the token list needs to be written to the dvi file. This happens, as in the case of \write, at the time of shipout.
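To see the delay in (a), here is a minimal sketch that writes to the log file (stream -1) so that no output file is needed. The macro name \status is hypothetical and exists only to show when expansion happens:

```latex
\documentclass{article}
\begin{document}
\def\status{early}
Some text.%
% Delayed: a whatsit node is stored in the current list, and the
% token list is expanded only when the page is shipped out.
\write-1{delayed write sees: \status}%
% Immediate: expanded and written right now; no whatsit is stored.
\immediate\write-1{immediate write sees: \status}%
\def\status{late}
\end{document}
```

In the log, the immediate line should appear first and report "early", while the delayed line should appear at shipout and report "late", because \status has been redefined by then.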
Practical implementations can be found in PostScript, PDF, and color drivers and in graphics programs. An interesting read is always the hyperref manual. The package uses \special commands extensively to implement the interface between TeX commands and the PDF page description language. They are very simple to write:
\immediate\special{!pdfpagelabels #2}%
To summarize, it is a free-for-all hook/interface. As for why they were called whatsits, my guess is that it was a Knuth-style (à la \fi) whatisit. This simple, innocent \special command enabled TeX to survive and adapt over the years, producing output from PostScript to PDF and introducing color and graphics.
Best Answer
In addition to @Joseph's points, things to be aware of:
\unskip acts on horizontal and vertical space, so it will remove vertical space if used between paragraphs.
Unlike \ignorespaces, which affects the conversion between input characters and tokens, \unskip works on the actual lists inside boxes, after all tokenisation and commands have been executed.
\unskip cannot be used in outer vertical mode: once an item has been added to the main vertical list of a page it cannot be removed. So while in a minipage you can remove preceding vertical space with \unskip, on the main page you have to use \vskip-\lastskip to back up over the previous skip rather than actually removing it. This leaves breakable glue, so you may also need to inject some penalties (\penalty) to prevent page breaking.
Consider:
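A minimal sketch of the kind of test discussed next. The box contents ("a g", a 20pt skip, and a box holding b) and all dimensions are my assumptions for illustration, not a definitive listing:

```latex
\documentclass{article}
\begin{document}
% Show full box contents, on the terminal as well as in the log.
\showboxdepth=5 \showboxbreadth=50 \tracingonline=1
% Box 0: a paragraph ending in "g", a penalty, a 20pt skip, then "b".
\setbox0\vbox{a g\par\penalty10000 \vskip 20pt\relax \hbox{b}}
% Box 2: \unskip really deletes the glue node from the list.
\setbox2\vbox{a g\par\penalty10000 \vskip 20pt\relax \unskip \hbox{b}}
% Box 4: a second, negative glue node cancels the first one,
% but both nodes stay in the list.
\setbox4\vbox{a g\par\penalty10000 \vskip 20pt\relax
  \vskip-\lastskip \hbox{b}}
% Each \showbox pauses like \show; press Enter or run in nonstop mode.
\showbox0 \showbox2 \showbox4
\end{document}
```

In the resulting listings, box 2 should have no 20pt glue node after the penalty, while box 4 should show a 20pt glue node followed by a -20pt one.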
Box 0 is
But suppose (as in box 2) that the code adding b needs to remove the space above. It could use \unskip, which literally removes it, resulting in
If instead of removing it, negative space is added to compensate, as in box 4, then you get
This looks the same, but if the penalty were not 10000 then there would be a feasible breakpoint at that position, which means that if the list were unboxed the end of the first part would have depth 0 rather than the depth of g. That can have subtle (or not so subtle) effects on positioning that are hard to correct (or at least hard to remember to correct).
So the issues surrounding \unskip are a lot simpler than those of the \vskip-\lastskip combination. However, if you are not in a box, you don't have a choice, as the last version on the main page produces: