Short answer is that it's not possible.
There are some tools that do some things, but they cannot really analyse the LaTeX document, so any advice they give should only be taken as a hint: it might be wrong.
The big difference between LaTeX and the languages that you mention, like C and Java, is that the syntax of LaTeX cannot be analysed statically: even the basic lexical analysis and tokenisation of the input depends on run-time behaviour.
\section[abc}
Looks like it might be a syntax error that you might expect a static analysis to pick up but the document might be
\documentclass{article}
\ifodd\time\catcode`[1\fi
\begin{document}
\section[abc}
aa
\end{document}
which means that it is or is not a valid document depending on the number of minutes since midnight. This is obviously an extreme case but not as extreme as you may think. Lots of packages do similar things that change the analysis of the document, think of babel shorthands for example. The fact that babel has been loaded can be statically detected by inspecting the preamble, but determining which language is in force at any point really requires running a full LaTeX interpreter.
Even if it were possible I'd question if some of your items really should be flagged.
- unreachable code: `\if\else\fi` constructions where one or more paths can never be reached?
The difficulty here is determining which tokens are in fact tests. Mostly you do not see TeX primitives such as `\if` but tokens defined via `\newif`, which are harder for a checker to recognise. It could perhaps assume that every token starting `\if...` is an if token in this sense, but for example LaTeX's `\ifthenelse` starts with `\if...` yet has a very different syntax.
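To make the recognition problem concrete, here is a small sketch. The name `\ifdraft` and the use of the `ifthen` package are only illustrative assumptions; the point is that both tests start with `\if` but are written quite differently.

```latex
\documentclass{article}
\usepackage{ifthen}
% \newif creates three tokens: \ifdraft, \drafttrue and \draftfalse
% (\ifdraft is a hypothetical name chosen for this sketch)
\newif\ifdraft
\drafttrue
\begin{document}
% A \newif test uses the primitive-style \if... \else ... \fi layout
\ifdraft Draft version\else Final version\fi

% \ifthenelse also starts with \if but takes three braced arguments
\ifthenelse{\value{page} > 1}{Later page}{First page}
\end{document}
```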
- inefficient loops, like `\foreach`'s that could be simplified?
`\foreach` is simply a macro, so almost by definition any particular use of it can be simplified by expanding out the macro. But that may not be seen as simplification...
LaTeX and all its packages are macro definitions and most documents don't use most of the commands defined, so there are typically thousands of unused macros in any given document.
- suggesting the use of `\newcommand*` instead of `\newcommand` where appropriate?
I'm not sure how this could be done unless you record every use of the macro in a given document and note that its arguments never contain a `\par` in that case.
- suspicious lack of possible brackets or whitespaces? Like:
  1. `a^b c` - clear
  2. `a^{bc}` - clear
  3. `a^bc` - suspicious: renders like 1. but maybe 2. was intended?
I'd disagree with this check. 2. is the standard LaTeX syntax. If you decide to allow 1. then you should allow 3. as well without comment. It's a central part of the design of TeX math mode syntax that white space is not significant other than for terminating command names.
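A quick way to see this for yourself (a sketch, not a recommendation either way) is to typeset the three forms side by side:

```latex
\documentclass{article}
\begin{document}
% Forms 1 and 3 differ only by a space, which math mode ignores
$a^b c$ \quad $a^{bc}$ \quad $a^bc$
\end{document}
```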
- suspicious empty lines (paragraphs)? For example between text and equation?
TeX goes to some trouble to distinguish the case that the text following a display is or is not a new paragraph, and LaTeX emulates this behaviour for all its list environments.
So unless the static analyser is interpreting the sentences and suggesting that it should not be the start of a paragraph it should not be commenting on blank lines.
- suspicious or missing [end line comments][1]?
Yes, so long as it can recognise the start of LaTeX3 syntax or similar packages that change the rules and mean `%` is not necessary.
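As a rough illustration of why the checker needs to know which rules are in force, consider the sketch below. The macro names are hypothetical, and loading `expl3` explicitly is only needed on older LaTeX kernels.

```latex
\documentclass{article}
\usepackage{expl3} % harmless on recent kernels, needed on older ones
% Line ends inside the definition become spaces in the replacement text
\newcommand{\spaced}{
  x
}
% End-of-line comments suppress those spaces
\newcommand{\tight}{%
  x%
}
% Inside \ExplSyntaxOn spaces and line ends are ignored, so no % is needed
\ExplSyntaxOn
\newcommand{\alsotight}{
  x
}
\ExplSyntaxOff
\begin{document}
[\spaced] [\tight] [\alsotight]
\end{document}
```

Only the first macro should show extra space inside the brackets, so a warning about missing `%` would be justified for `\spaced` but spurious for `\alsotight`.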
- whatever else you can think of that is very often wrong or inefficient or unclear to the human eye?
Getting a human to proofread the document is a good idea; human eyes are still better at this than machines :-)
Best Answer
Here is a list:
Using math-related content without specifying it explicitly as being math:
While the compiler may recover from this, the output doesn't resemble the expectation:
Moreover, TeX complains with the following - very typical - error:
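The listing and log output did not survive in this copy. Judging from the fix described next, the input was presumably along these lines; treat the exact wording as an assumption:

```latex
\documentclass{article}
\begin{document}
% \alpha and \beta are math-only symbols used here in text mode;
% TeX reports: ! Missing $ inserted.
Let's do \alpha and then \beta.
\end{document}
```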
To fix this error, explicitly introduce a switch to/from math mode by using `$`..`$` or `\(`..`\)` (for inline math content). That is, use `Let's do $\alpha$ and then $\beta$.`.
When intermixing text and math, some people use `\mbox` to box the text-related stuff. The reason being that `\mbox` necessarily switches to text mode. So, if you have math content within `\mbox`, you need to explicitly restate its use. A short example causing a `Missing $ inserted` error is something like `\mbox{where x\geq 0}`: the symbol `\geq` requires math mode. Moreover, the spacing is off if the math content is not set in math mode (but that's typographic in nature).
This error is fixed by using something like `\mbox{where $x\geq 0$}`. Better yet, the `amsmath` package provides `\text`, which scales the text to the appropriate font size based on its context, while `\mbox` does not.
Code readability is encouraged, which is sometimes synonymous with leaving white space within your code. In an elementary example with a blank line on either side of an equation inside a math display, however, leaving (too much) white space causes an error (`Missing $ inserted`). (Sure, removing just one of the two blank lines would still cause the problem. But it seems more hidden due to a symmetric blank line around the equation.)
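As a sketch of the situation (the original listing is not shown here, so the `\[`..`\]` display and the formula itself are assumptions), the blank lines inside the display are what trigger the error:

```latex
\documentclass{article}
\begin{document}
Some text before the equation.
% The blank lines after \[ and before \] act as \par inside math mode
\[

  f(x) = x^2

\]
Some text after the equation.
\end{document}
```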
This can be avoided by either removing the blank lines, or introducing comment characters `%`. See Blank lines in `align` environment.
When typesetting regular delimiters, there is no need to match the pairs. For example, you can have expressions like `(a,b]` or even `|a,b`. However, when using extensible delimiters you need to define a pair using `\left` and `\right`. If you want to only use a single extensible delimiter, then you still have to define a paired delimiter using `.`, as in `\left\{ ... \right.`: even though no right delimiter is needed, it has to be specified using `\right.`.
On that note, a common mistake leading to errors when delimiting using `{ }` is forgetting to escape `{`, that is, writing `\left{`. The correct usage is `\left\{`.
Or, if you plan on using a notation like `]a,b[` (for whatever reason) and you want extensible delimiters, using `\right]` and then `\left[` just because the brackets face outward seems logical, but doesn't work. Left delimiters use `\left` and right delimiters `\right`, regardless of the delimiter orientation, so this would be `\left]a,b\right[`.
It is definitely intuitive to use `\\` to cause a line break. And, this works in some cases. However, in other cases, LaTeX complains about the fact that `There's no line here to end`. The complaint is about a line break that is purely unexpected on LaTeX's end. In such a case, the proposed quick-fix solution is to modify the line break `\\` to `\hspace*{\fill} \\`.
Here is another example:
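The second example is likewise missing from this copy. One situation that fits the description, assumed here purely for illustration, is a doubled `\\` inside a `center` environment, where the first `\\` already ends the line:

```latex
\documentclass{article}
\begin{document}
\begin{center}
  % The second \\ finds no line to end and is expected to raise
  % "There's no line here to end."
  A Title \\ \\
  An Author
\end{center}
\end{document}
```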
A solution is provided by modifying the double line breaks `\\` into a single line break with an optional length skip `\\[\baselineskip]`.
You are learning how to write macros or your own commands and try to define `\i` with `\newcommand`, only to see LaTeX complain `Command \i already defined`. But you never defined it in any other way, how can it be defined already?! The reason is that the document class itself can define certain commands, just like any package that is loaded can define its own macros (and environments). Moreover, TeX also defines some macros by default (around 900 or so). Specific to this case, it is possible to see what `\i` was defined as previously, and how to correct for this new definition; for instance, one can use `\show\i` and inspect the `.log` file to see the "formal definition."
Another solution, provided by LaTeX2e, is to use `\providecommand` rather than `\newcommand` or `\renewcommand`, since it has a built-in existence check. Or, one can use TeX's `\def` command, which has a slightly different interface. However, use these with caution, since commands defined by packages or classes have specific meaning and redefinition without the knowledge of it could have disastrous consequences.
You find a snippet of LaTeX code online and wrap it into a minimal document:
However, LaTeX is not happy and complains about `Missing \begin{document}`. Obviously it's there... so what's the problem? Well, the `@` symbol is reserved when it comes to macro definitions and can't be used without some precaution.
The category code for `@` needs to be changed in order for it to be used in a macro definition. Using `\makeatletter` will "make `@` have 'letter'/11 as a category code" so that it can be used. Usually, it is accompanied by a `\makeatother` counterpart, which "makes `@` have 'other'/12 as a category code".
You wish to define a command that does some formatting and decide to define a similarly-named environment that does the same thing, perhaps for the sake of convenience:
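The original listing is missing from this copy. Line 3 below is taken from the error message quoted next, while the body of the command on line 2 is an assumption:

```latex
\documentclass{article}
\newcommand{\dosomething}[1]{\textbf{#1}}% line 2 (body assumed)
\newenvironment{dosomething}{\bfseries}{}% line 3, as quoted in the error
\begin{document}
\dosomething{bold text}
\end{document}
```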
LaTeX complains that `Command \dosomething already defined` at `l.3 \newenvironment{dosomething}{\bfseries}{}` (line 3) even though you defined `\dosomething` only once (in line 2).
The problem here is that when one uses `\newenvironment{myenv}{<beg env>}{<end env>}`, LaTeX defines two commands `\myenv` and `\endmyenv`, respectively using the `<beg env>` and `<end env>` definition. To correct for this, use unique names for commands and environments, even if they perform very similar tasks. For example, in the above case, define the command `\boldcmd` and environment `boldenv`, say.
You start playing around with graphics and would like to include a picture using the `graphicx` package, loading it with `\usepackage{graphicx}` inside the `document` environment, but LaTeX complains that "Can be used only in preamble." This is not because of the use of `\includegraphics` (which is correct), but the required/included package (`graphicx`). The document structure requires certain "things" to be used only in certain places. In this case, you cannot load a package inside the `document` environment.
Place `\usepackage{graphicx}` between `\documentclass{article}` and `\begin{document}` - the document preamble.
Nesting commands usually works; in `\textbf{\textit{...}}`, for example, `\textit` is passed as an argument to `\textbf`. However, you attempt to do the same with so-called verbatim content and receive the error "`\verb` illegal in command argument."
`\verb` and friends are delicate and should be handled with care. This deals with the advanced topic of category codes and is perhaps best described in the UK TeX FAQ entry Why doesn’t verbatim work within …?, together with appropriate solutions/alternatives. Most notably, ask yourself whether using verbatim is actually necessary.
Someone points you to `amsmath`'s `align` environment after reading `\eqnarray` vs `\align` and requests a duplication of the Taylor expansion of `\sqrt{1-x}`. Ambitious in your attempt, you try:
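The attempted code is not preserved in this copy. A sketch of the kind of input that is expected to produce the error discussed next, assuming the expansion is broken across two rows of `align*`:

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% \left( opens in the first row and \right) closes in the second,
% so the pair straddles both an & column separator and a \\ row break
\begin{align*}
  \sqrt{1-x} &= 1 - \left( \frac{x}{2} + \frac{x^2}{8} \\
             &\qquad + \frac{x^3}{16} + \cdots \right)
\end{align*}
\end{document}
```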
(La)TeX spits out an error `Extra }, or forgotten \right`.
`\left` and `\right` are delimiter counterparts and should always appear together and in the same (nested) group in math mode. In the above example, `\left` and `\right` are in different groups, not only spanning the `align` columns, but also rows. In instances like these, it's much more convenient to use the separable `\Bigl` and `\Bigr` delimiter extenders. Alternatives also include using `\vphantom` to scale the separate groups vertically to the maximum size while using the null delimiters `\right.` and `\left.` where needed. See `\left`/`\right` across multiline equation.
Working with tables, you wish to insert some form of row labelling in the first column using a `[..]` style notation, so that a row ending in `\\` is followed by a row starting with, say, `[b]`. LaTeX complains about "Missing number, treated as zero" and "Illegal unit of measure (pt inserted)". These errors are related to the optional argument that can be specified with the control sequence `\\`; `[b]` is considered to be the optional argument to `\\`, effectively trying to evaluate `\\[b]`, where `b` is expected to be a length/dimension. This is not the case, so it's substituted for zero (as the first error message reports), yet there is no dimension associated with it (as reported by the second error message). One reference to such an instance is given in Allow `[` characters in tables without entering math-mode (from R).
The work-around here is to insert a non-`[` element at the appropriate location to prevent LaTeX from grabbing them as optional arguments to `\\`. For example `{}[b]` would do.
Experimenting with headers/footers, you attempt to stack some content (say, `\thepage` and `My Name`, separated by a blank line) in order to provide more information to the reader.
LaTeX complains about a runaway argument (a paragraph ended before the command's argument was complete), yet everything seems fine.
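The header code itself is not reproduced above. As a package-independent sketch of the same complaint, assume two hypothetical macros that differ only in the star of `\newcommand`:

```latex
\documentclass{article}
% Both macros are hypothetical; they differ only in the star
\newcommand{\longstack}[1]{\parbox{4cm}{#1}}
\newcommand*{\shortstack}[1]{\parbox{4cm}{#1}}
\begin{document}
% Accepted: the argument may contain a paragraph break
\longstack{\thepage

My Name}

% Rejected with "Paragraph ended before \shortstack was complete."
\shortstack{\thepage

My Name}
\end{document}
```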
This problem is partially related to (2), but stems from the fact that there are informally two different definition types in TeX: those that allow paragraph breaks, and those that don't. The former are macros defined using `\newcommand` or `\long\def` while the latter are those defined using `\newcommand*` or `\def`. As reference, see What's the difference between `\newcommand` and `\newcommand*`?
In order to get around this, don't use multiple paragraphs inside macros that can't handle them. This particular problem could be avoided using `\thepage \\ My Name`, which technically sets a single paragraph. One could also use a `tabular` to provide the stacked look.
In an attempt to speed up your coding, you start using abbreviations for certain regularly-used constructions, like in the following minimal example:
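The minimal example itself is missing here. Assuming `\bv` and `\ev` abbreviate the start and end of a `verbatim` environment (a guess based on the names and on the explanation that follows), it may have looked like this:

```latex
\documentclass{article}
% Assumed definitions: shorthands for \begin{verbatim} and \end{verbatim}
\newcommand{\bv}{\begin{verbatim}}
\newcommand{\ev}{\end{verbatim}}
\begin{document}
\bv
Some verbatim text.
\ev
\end{document}
```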
One would think that the replacement texts for `\bv` and `\ev` would be inserted in the input stream, yielding the expected output.
In general, this works, as TeX merely inserts the replacement text of the given abbreviations into the input stream. Their replacement executions follow the expected syntax and everything works out fine. In some instances though, it doesn't. The above is one of them, although there are others as well. This feature lies with the coding of the environment, which expects an explicit `\end{<env>}` form and scans until that is found. Follow the suggested work-around in the package documentation, if this exists. In other instances it's better to avoid the shorthand altogether.