[Tex/LaTex] What’s the best practice way to test whether parameter is empty


I have a macro that might work either in LaTeX or in plain TeX and I'd like to test whether one of its parameters, which presumably should be just text, is empty. Currently I do it this way:

\bgroup
\setbox0=\hbox{#1}
\ifdim\wd0=0pt
    it is empty
\else
    it is not empty
\fi
\egroup

I wonder whether this is the “proper” way, or whether there is some other way that is closer to best practice.

Thx.

Best Answer

The method of the question

\bgroup
\setbox0=\hbox{#1}
\ifdim\wd0=0pt
    it is empty
\else
    it is not empty
\fi
\egroup

has several problems, if used for a general test:

\setbox0\hbox{#1} leaks color specials when #1 contains top-level \color commands, because the color macros use \aftergroup to reset the color after the current group (here the \hbox). This can be fixed for both plain TeX and LaTeX by using an additional group:
```
\setbox0=\hbox{\begingroup#1\endgroup}
```
In LaTeX \sbox0{#1} can be used.

#1 can contain material whose overall width is nevertheless zero. Examples:

\hbox{$\not$}% the width of the glyph `\not` is zero so that `\not=` forms the "not equals" symbol
\hbox{\rlap{Text}}

#1 can contain so much material that the width overflows. TeX does not throw an error, but the width can accidentally end up as zero again.
The width of the \hbox depends on the font. For example, TikZ sets \nullfont in its environments, so the width would always be zero there.
\bgroup and \egroup are the macro forms of { and }. They have a serious side effect in math mode: they form a subformula, which behaves as an ordinary math atom and influences the horizontal spacing. This can be fixed by using \begingroup and \endgroup instead.
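
As a rough illustration of the points above, here is a minimal plain TeX sketch of the box test with both fixes applied (an inner group inside the \hbox, and \begingroup/\endgroup around the whole test). The macro name \BoxTest is hypothetical, and \message is used only to make the result visible in the log. The last call shows that the false positive for zero-width material remains:

```tex
% Plain TeX sketch; \BoxTest is a hypothetical name, not from the answer.
\def\BoxTest#1{%
  \begingroup
    % the inner group keeps \aftergroup tokens inside the box
    \setbox0=\hbox{\begingroup#1\endgroup}%
    \ifdim\wd0=0pt
      \message{empty}%
    \else
      \message{not empty}%
    \fi
  \endgroup
}
\BoxTest{}             % reports "empty", correctly
\BoxTest{Text}         % reports "not empty", correctly
\BoxTest{\rlap{Text}}  % reports "empty", although the argument is not
\bye
```

So even the repaired version is only suitable when zero-width content can be ruled out.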

Test based on macro definition

#1 can be put into a macro and then the macro can be tested for emptiness:

\def\param{#1}%
\ifx\param\empty
  The parameter is empty.%
\else
  The parameter is not empty.%
\fi

This is not perfect yet, because #1 might contain #, which is problematic in macro definitions: inside a definition, a # must either be doubled or be followed by a parameter number. This can be avoided by using a token register:

\toks0={#1}%
\edef\param{\the\toks0}% No full expansion, the token register is unpacked only
\ifx\param\empty
...

In LaTeX \@empty can be used, but it also provides plain TeX's \empty.

The side effects of setting \toks0 and defining \param can be removed by using a group:

\begingroup
  \toks0={#1}%
  \edef\param{\the\toks0}%
\expandafter\endgroup
\ifx\param\empty
...

This solution works in both plain TeX and LaTeX; e-TeX is not used. Because of the assignment and definition the code is not expandable.
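
For illustration, the grouped token-register test could be wrapped in a macro roughly as follows. This is only a sketch: the name \CheckEmpty is made up here, and \message merely reports the outcome in the log. It also shows that an argument consisting of a single space or containing # counts as not empty:

```tex
% Plain TeX sketch; \CheckEmpty is a hypothetical name.
\def\CheckEmpty#1{%
  \begingroup
    \toks0={#1}%
    \edef\param{\the\toks0}% the register contents are not expanded further
  % \expandafter evaluates \ifx while \param still exists,
  % then \endgroup discards \param and restores \toks0
  \expandafter\endgroup
  \ifx\param\empty
    \message{empty}%
  \else
    \message{not empty}%
  \fi
}
\CheckEmpty{}     % empty
\CheckEmpty{ }    % not empty: a space token is still a token
\CheckEmpty{a#b}  % not empty, and the # causes no error
\bye
```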

Expandable tests

If e-TeX is available, \detokenize allows a safe expandable method, see also the answer of PhilipPirrip:

\if\relax\detokenize{#1}\relax
  The parameter is empty.%
\else
  The parameter is not empty.%
\fi

Because \detokenize converts its argument to simple characters with category code "other" (the same as digits), or category code "space" in the case of the space character, the result does not contain any command tokens or other problematic material that could break \if.
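
To make the expandability concrete, the following sketch uses the \detokenize test inside \edef, where assignments would not work. It assumes an engine with e-TeX extensions (e.g. pdftex or etex); the names \EmptyOrNot, \resultA, \resultB and \undefinedmacro are hypothetical:

```tex
% Plain TeX sketch, requires e-TeX; all macro names here are made up.
\def\EmptyOrNot#1{%
  \if\relax\detokenize{#1}\relax
    empty%
  \else
    not empty%
  \fi
}
% The test runs purely by expansion, so it can be used inside \edef.
\edef\resultA{\EmptyOrNot{}}
% \undefinedmacro is never executed: \detokenize turns it into characters.
\edef\resultB{\EmptyOrNot{\undefinedmacro}}
\message{[\resultA] [\resultB]}% writes "[empty] [not empty]" to the log
\bye
```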

Without e-TeX the test should not use \if, but \ifx:

\ifx\relax#1\relax
...

However, a macro with the meaning \relax might be present at the start of parameter #1. Therefore \relax should be replaced by something that is unlikely to be used in #1. Examples:

\def\TestEmptyFence{TestEmptyFence}

\ifx\TestEmptyFence#1\TestEmptyFence
...

Alternatively, Donald Arseneau (url.sty, ...) often uses a character with an unusual catcode:

\begingroup
  \catcode`Q=3 %
  \gdef\MyEmptyTestMacro#1{%
    \ifxQ#1Q%
    ...
  }%
\endgroup % restore catcode of Q
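
A complete version of the catcode trick might look like the sketch below. The macro name \EmptyOrNotQ... is deliberately avoided: while Q has catcode 3 it is no longer a letter, so the macro name itself must not contain a Q. The name \EmptyCheck is hypothetical, and the inline branches are just one possible way to fill in the body:

```tex
% Plain TeX sketch without e-TeX; \EmptyCheck is a hypothetical name.
\begingroup
  \catcode`Q=3 % inside this group Q is a math-shift character
  \gdef\EmptyCheck#1{%
    \ifxQ#1Q% true only if nothing stands between the two special Qs
      empty%
    \else
      not empty%
    \fi
  }%
\endgroup % restore catcode of Q
\message{\EmptyCheck{}}      % empty
\message{\EmptyCheck{Quark}} % not empty: this Q is an ordinary letter
\bye
```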

However, these expandable tests without e-TeX can be broken, e.g., if #1 contains unmatched \if commands. To reduce these problems, see the much more elaborate answer of Ulrich Diez.

The best solutions are the grouped token-register test (non-expandable, without e-TeX) and the \detokenize test (expandable, with e-TeX).

Improvement of the if branches

\def\foobar#1#2#3{%
  \if...
    #2% if true
  \else
    #3% otherwise
  \fi
}

can be improved, because the code has a limitation: #2 is followed by \else and #3 is followed by \fi. Thus neither #2 nor #3 can end with a macro that expects following parameters, e.g.:

\foobar{...}{\textit}{\textbf}{Hello}

Instead of Hello, \textit gets \else and \textbf gets \fi as its parameter, which breaks the code.

The standard way is finishing the \if construct first and selecting the argument via \@firstoftwo and \@secondoftwo:

\def\foobar#1{%
  \if...
    \expandafter\@firstoftwo
  \else
    \expandafter\@secondoftwo
  \fi
}

The \expandafter closes the current if branch first. The macros \@firstoftwo and \@secondoftwo are defined in LaTeX:

\long\def\@firstoftwo#1#2{#1}
\long\def\@secondoftwo#1#2{#2}

Plain TeX does not have an equivalent, thus they need to be defined there.
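
In plain TeX this could look like the following sketch. The names without @ (\firstoftwo, \secondoftwo) and the helper macros \IfZero, \emphasize and \strong are assumptions made for this example; \ifnum merely stands in for whatever test is needed:

```tex
% Plain TeX sketch; all macro names defined here are hypothetical.
\long\def\firstoftwo#1#2{#1}
\long\def\secondoftwo#1#2{#2}
\def\IfZero#1{%
  \ifnum#1=0
    \expandafter\firstoftwo
  \else
    \expandafter\secondoftwo
  \fi
}
\def\emphasize#1{{\it #1}}
\def\strong#1{{\bf #1}}
% The conditional is finished before the selected macro runs,
% so the selected macro can pick up the argument that follows.
\IfZero{0}{\emphasize}{\strong}{Hello} % same as \emphasize{Hello}
\IfZero{1}{\emphasize}{\strong}{Hello} % same as \strong{Hello}
\bye
```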

Related Solutions

[Tex/LaTex] Test whether token register is empty

At the time of writing, the other TeX-based answers on this page are flawed in that they hide a conditional \ifx inside a macro but still use \else/\fi at the "top" level. This means that things break unexpectedly when used inside other conditionals.

The LaTeX3 programming language expl3 contains a module for doing stuff with token registers:

\usepackage{expl3}
...
\ExplSyntaxOn
\toks_if_empty:NTF \mytoks {true} {false}
\ExplSyntaxOff

It essentially does internally what the other answers here are suggesting, but it uses expansion to grab its arguments so the branching is robust (and you don't have \fi lying around to get in your way).

Update: So what does this approach do that is superior to other methods? Consider the style of solution first offered in answer to this question:

\def\IfEmpty#1{%
  \edef\1{\the#1}%
  \ifx\1\empty
}
...
\IfEmpty\foo T\else F\fi

This doesn't behave nicely when nested, because TeX scans ahead when discarding unfollowed branches of a conditional. Consider

\ifx\bar\baz
  \IfEmpty\foo T\else F\fi % <- uh oh
\else
  E
\fi

If \bar = \baz, then the second branch is discarded and the first branch is executed. So far so good. If \bar ≠ \baz, then the first branch is discarded by reading ahead until the first unmatched \else — and this is the one in the line labelled "uh oh" above. So you could collapse the expansion of the above snippet in this case to:

\iffalse\else F\fi % <- uh oh
\else
  E
\fi

and hence the cause of the ‘Extra \else’ error message in this case.

So this form for conditionals doesn't work so well. Next try. You can also write this style of code like this:

\def\IfEmpty#1#2#3{%
  \edef\1{\the#1}%
  \ifx\1\empty
    #2%
  \else
    #3%
  \fi
}

This avoids the problems of nesting as in the previous trial solution, but it's prone to another problem: #2 and #3 have trailing material behind them, namely \else and \fi. This is a problem if you want to write something like

\def\processfoo#1{...something with #1...}
\IfEmpty\foo{\error}{\processfoo} {arg}

because the #1 passed to \processfoo will be \fi instead of the desired {arg}. The conditional in this case is better written as

\def\IfEmpty#1#2#3{%
  \edef\1{\the#1}%
  \ifx\1\empty
    \expandafter\@firstoftwo
  \else
    \expandafter\@secondoftwo
  \fi
  {#2}{#3}%
}

to overcome this problem. This is how expl3 conditionals work, and it's why we're writing TF at the end of all their names to indicate "true" and "false" branches. (Or just T or just F if you only want one of them.)
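
To check that this form also survives nesting, here is a small plain TeX sketch. The names \IfToksEmptyTF, \tmp, \mytoks, \firstoftwo and \secondoftwo are chosen for this example only, and \message just reports which branch ran:

```tex
% Plain TeX sketch; all names defined here are hypothetical.
\long\def\firstoftwo#1#2{#1}
\long\def\secondoftwo#1#2{#2}
\def\IfToksEmptyTF#1{%
  \edef\tmp{\the#1}%
  \ifx\tmp\empty
    \expandafter\firstoftwo
  \else
    \expandafter\secondoftwo
  \fi
}
\newtoks\mytoks
\mytoks={}
% Outer test false: TeX skips the first branch and finds no stray \else or \fi.
\ifnum1=2
  \IfToksEmptyTF\mytoks{\message{T}}{\message{F}}%
\else
  \message{E}%
\fi
% Outer test true: the inner conditional runs and selects the first branch.
\ifnum1=1
  \IfToksEmptyTF\mytoks{\message{T}}{\message{F}}%
\else
  \message{E}%
\fi
\bye
```

The log should show E for the first block and T for the second, with no "Extra \else" error.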

Incidentally, there are expandable tests for checking for emptiness, which is why I suggest using the expl3 approach for this test. Expandability is not always required, of course, but code that is fully expandable tends to be more reliable and it's always nice to have for cases such as

\typeout{ \toks_if_empty:NT \foo {Warning:~\string\foo\space is~ empty} }

[Tex/LaTex] Naming LaTeX files: best practice

It depends a little on whether these files are used only by you or whether you plan to share them with other people, e.g. if you work with them on one document or the files are part of a LaTeX package.

But in general I would strongly recommend you to limit yourself to lowercase alphanumeric ASCII characters, i.e. a-z, 0-9 and -.

The reasons for that are:

Unicode or other non-ASCII characters can cause trouble when files are copied to a different file system with a different code page. I once could not even see files with German umlauts on a Windows share mounted under Linux. This is much better now, but it is still a risk. Classic TeX does not like non-ASCII characters much either.
Some file systems (FS) are case sensitive (e.g. those under *nix), others are not (e.g. FAT, NTFS). If you keep your file names all lowercase, you avoid collisions between files, which can lead to loss of data when copying from a case-sensitive to a case-insensitive FS. You will also run into trouble on case-sensitive systems when the file name used in the document has a different capitalization than the file on disk. You might not notice this on e.g. Windows, but it will hit you hard on a different FS.
Characters that are special in TeX will work as long as they are valid at this position, which excludes % and #. Others such as & can cause trouble as well, and there is no real reason to use them, so avoid them. Even _, which is commonly used and works inside \input, can cause trouble when the file name is to be printed, so avoid it as well.
Spaces in file names are "evil" as well, because some external tools treat them as file name separators. TeX should be fine with them, except when several are used in a row: TeX then combines them into one before interpreting the file name!
Dots will confuse the simple extension extraction algorithms used by LaTeX. See this question for an example.

I'm going through some effort in the svn-multi package to allow for arbitrary file names. This is done using verbatim mode which doesn't help for other input macros like \input, \include or \includegraphics.