At time of writing, the other TeX-based answers on this page are flawed in that they hide a conditional \ifx inside a macro but still use \else/\fi at the "top" level. This means that things break unexpectedly when used inside other conditionals.
The LaTeX3 programming language expl3 contains a module for doing stuff with token registers:
\usepackage{expl3}
...
\ExplSyntaxOn
\toks_if_empty:NTF \mytoks {true} {false}
\ExplSyntaxOff
It essentially does internally what the other answers here are suggesting, but it uses expansion to grab its arguments so the branching is robust (and you don't have \fi lying around to get in your way).
Update:
So what does this approach do that is superior to other methods? Consider the style of solution first offered in answer to this question:
\def\IfEmpty#1{%
\edef\1{\the#1}%
\ifx\1\empty
}
...
\IfEmpty\foo T\else F\fi
This doesn't behave nicely when nested, because TeX scans ahead when discarding unfollowed branches of a conditional. Consider
\ifx\bar\baz
\IfEmpty\foo T\else F\fi % <- uh oh
\else
E
\fi
If \bar = \baz, then the second branch is discarded and the first branch is executed. So far so good. If \bar ≠ \baz, then the first branch is discarded by reading ahead until the first unmatched \else, and this is the one in the line labelled "uh oh" above. So you could collapse the expansion of the above snippet in this case to:
\iffalse\else F\fi % <- uh oh
\else
E
\fi
and hence the cause of the 'Extra \else' error message in this case.
So this form for conditionals doesn't work so well. Next try. You can also write this style of code like this:
\def\IfEmpty#1#2#3{%
\edef\1{\the#1}%
\ifx\1\empty
#2%
\else
#3%
\fi
}
This avoids the nesting problems of the previous trial solution, but it's prone to another problem: #2 and #3 have trailing material behind them, namely \else and \fi. This is a problem if you want to write something like
\def\processfoo#1{...something with #1...}
\IfEmpty\foo{\error}{\processfoo} {arg}
because the #1 passed to \processfoo will be \fi instead of the desired {arg}. The conditional in this case is better written as
\def\IfEmpty#1#2#3{%
\edef\1{\the#1}%
\ifx\1\empty
\expandafter\@firstoftwo
\else
\expandafter\@secondoftwo
\fi
{#2}{#3}%
}
to overcome this problem. This is how expl3 conditionals work, and it's why we're writing TF at the end of all their names to indicate "true" and "false" branches. (Or just T or just F if you only want one of them.)
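To see both points together, here is a minimal sketch of a complete document. It assumes a token register \foo and a one-argument macro \processfoo as in the text; both are defined here only for illustration, and the outer tests on \bar are arbitrary stand-ins for a true and a false conditional.

```latex
\documentclass{article}
\makeatletter
% Robust form from the text: finish the conditional first, then pick a branch.
\def\IfEmpty#1#2#3{%
  \edef\1{\the#1}%
  \ifx\1\empty
    \expandafter\@firstoftwo
  \else
    \expandafter\@secondoftwo
  \fi
  {#2}{#3}%
}
\makeatother
\newtoks\foo              % illustrative token register
\def\processfoo#1{[#1]}   % illustrative macro that expects a following argument
\begin{document}
\foo={abc}% non-empty, so \IfEmpty selects its second branch

% Outer test is true: \processfoo picks up {arg} and typesets [arg].
\ifx\bar\bar
  \IfEmpty\foo{empty}{\processfoo}{arg}%
\else
  E%
\fi

% Outer test is false: the first branch is skipped cleanly, because no
% \else or \fi is exposed by \IfEmpty; only E is typeset.
\ifx\bar\relax
  \IfEmpty\foo{empty}{\processfoo}{arg}%
\else
  E%
\fi
\end{document}
```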
Incidentally, there are expandable tests for checking for emptiness, which is why I suggest using the expl3 approach for this test. Expandability is not always required, of course, but code that is fully expandable tends to be more reliable and it's always nice to have for cases such as
\typeout{ \toks_if_empty:NT \foo {Warning:~\string\foo\space is~ empty} }
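As a rough illustration of what expandability buys you, the following sketch uses the expl3 token list test \tl_if_empty:NTF as a stand-in for the token register test above. This is a swapped-in technique, not the test from this answer, and the variable \l_demo_foo_tl is a made-up name.

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% \l_demo_foo_tl is a hypothetical token list variable, used here in place
% of the token register \foo from the text.
\tl_new:N \l_demo_foo_tl
% \tl_if_empty:NTF is expandable, so it can be evaluated inside \typeout.
% A freshly created variable is empty, so the first branch is written.
\typeout { \tl_if_empty:NTF \l_demo_foo_tl { foo~is~empty } { foo~is~not~empty } }
\ExplSyntaxOff
\begin{document}
\end{document}
```

A non-expandable test could not be used this way, because \typeout only expands its argument and does not execute assignments.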
It depends a little on whether these files are only used by you or whether you are planning to share them with other people, e.g. if you work with them on one document or the files are part of a LaTeX package.
But in general I would strongly recommend that you limit yourself to lowercase alphanumeric ASCII characters and the hyphen, i.e. a-z, 0-9 and -.
The reasons for that are:
- Unicode or other non-ASCII characters can cause trouble when copied to a different file system with a different code page. I had the case that I couldn't even see files with German umlauts on a Windows share mounted by Linux. This is now much better, but still a risk. Normal TeX doesn't like non-ASCII characters that much either.
- Some file systems (FS) are case sensitive (e.g. *nix FSs), others aren't (e.g. FAT, NTFS). If you keep your file names all lowercase you avoid collisions between files, which can lead to loss of data when copied from a case-sensitive to a case-insensitive FS. Also you will run into trouble on case-sensitive systems when the file name used in the document has a different capitalization than the one on the hard drive. You might not notice that on e.g. Windows, but it will then hit you hard on a different FS.
- Characters which are special in TeX will work as long as they are valid at this position, which excludes % and #. Others such as & can cause trouble as well and there is no real reason to use them, so avoid them. Even _, which is commonly used and will work inside \input, can cause trouble when the filename should be printed, so avoid it as well.
- Spaces are "evil" in file names as well because some external tools will take them as file name separators. TeX should be fine with them, except when they are used multiple times in a row. TeX will then combine them into one prior to the interpretation as a filename!
- Dots will confuse the simple extension extraction algorithms used by LaTeX. See this question for an example.
I'm going through some effort in the svn-multi package to allow for arbitrary file names. This is done using verbatim mode, which doesn't help for other input macros like \input, \include or \includegraphics.
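A small sketch of the recommendation above, assuming an e-TeX based engine for \detokenize. The file names are made-up examples; the first one is generated by filecontents only so that the snippet is self-contained.

```latex
% File name limited to a-z, 0-9 and "-"; the name itself is a made-up example.
\begin{filecontents}{chapter-01-intro.tex}
Text of the first chapter.
\end{filecontents}
\documentclass{article}
\begin{document}
\input{chapter-01-intro}

% Printing a name that contains "_" needs care, because "_" is a math-only
% character in normal text. \detokenize (e-TeX) inside \texttt is one option.
\texttt{\detokenize{old_chapter_name.tex}}
\end{document}
```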
Best Answer
The method of the question has several problems if used for a general test:
- \setbox0\hbox{#1} leaks color specials when #1 contains top-level \color commands, because the color macros use \aftergroup to reset the color after the current group (the \hbox). This can be fixed for both plain TeX and LaTeX by using an additional group. In LaTeX, \sbox0{#1} can be used.
- #1 can contain material while the overall width is still zero.
- #1 can contain so much material that the width overflows. TeX does not throw an error, but the width can accidentally be zero again.
- The width of the \hbox depends on the font. For example, TikZ sets \nullfont in its environments, thus the width would always be zero.
- \bgroup and \egroup are the macro forms of { and }. They have a serious side effect in math: they form a math subformula, which behaves as a math ordinary atom and influences the horizontal spacing. This can be fixed by using \begingroup and \endgroup.
Test based on macro definition
#1 can be put into a macro and then the macro can be tested for emptiness. This is not perfect yet, because #1 might contain #, which is problematic in macro definitions, because the # characters need to be doubled and numbered. This can be avoided by the use of a token register. In LaTeX \@empty can be used, but LaTeX also provides plain TeX's \empty.
The side effects of setting \toks0 and defining \param can be removed by using a group. This solution works in both plain TeX and LaTeX; e-TeX is not used. Because of the assignment and definition, the code is not expandable.
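One way the group-based variant could be written is sketched below. The macro name \IfEmptyTF is an assumption made for illustration; \toks0, \param and \empty are the names used in the text.

```latex
% Hypothetical name \IfEmptyTF; works in plain TeX and LaTeX without e-TeX.
\def\IfEmptyTF#1#2#3{%
  \begingroup
    % Store #1 in a token register, so that # inside #1 causes no trouble.
    \toks0{#1}%
    % Inside \edef the contents of \toks0 are inserted without further expansion.
    \edef\param{\the\toks0}%
  % \expandafter evaluates \ifx first, while \param is still defined;
  % only then does \endgroup restore \toks0 and \param.
  \expandafter\endgroup
  \ifx\param\empty
    #2%
  \else
    #3%
  \fi
}
% Usage: an empty first argument selects the first branch.
\IfEmptyTF{}{empty}{not empty}
```

With this sketch, an argument that consists of a single space counts as not empty, because \param then contains a space token.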
Expandable tests
If e-TeX is available, \detokenize allows a safe expandable method; see also the answer of PhilipPirrip. Because \detokenize converts its argument to simple characters with category code other (the same as digits), or category code space in the case of a space, the result does not contain any command tokens or other problematic stuff that could break \if.
Without e-TeX the test should not use \if, but \ifx, with \relax as the delimiter around #1. However, a macro with meaning \relax might be present at the start of parameter #1. Therefore \relax should be replaced by something that is unlikely to be used in #1. Donald Arseneau (url.sty, ...) often uses a character with an unusual catcode for this purpose.
However, these expandable tests without e-TeX can be broken, e.g., if #1 contains unmatched \if commands. To reduce these problems, see the much more elaborate answer of Ulrich Diez.
I have marked the best solutions, non-expandable without e-TeX and expandable with e-TeX, by adding a quote environment to the code block.
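The two expandable variants described above might look roughly as follows. The names \IfEmptyE, \IfEmptyX and \EmptyMarker are made up for this sketch; the marker macro is just one possible choice of "something unlikely to be used in #1".

```latex
% Variant 1: expandable, requires e-TeX's \detokenize.
% If #1 is empty, \if compares \relax with \relax and the test is true.
% Otherwise \relax is compared with a character token and the test is false.
\def\IfEmptyE#1#2#3{%
  \if\relax\detokenize{#1}\relax
    #2%
  \else
    #3%
  \fi
}

% Variant 2: expandable without e-TeX, using \ifx and a marker macro.
% \EmptyMarker is a hypothetical macro that should never occur in #1.
% This variant can still break if #1 contains unmatched \if, \else or \fi.
\def\EmptyMarker{EmptyMarker}
\def\IfEmptyX#1#2#3{%
  \ifx\EmptyMarker#1\EmptyMarker
    #2%
  \else
    #3%
  \fi
}
```

In both variants \IfEmptyE{}{yes}{no} and \IfEmptyX{}{yes}{no} select the first branch, while a non-empty first argument such as abc selects the second.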
Improvement of the if branches
A definition that places #2 and #3 directly inside the conditional can be improved, because it has the limitation that #2 is followed by \else and #3 is followed by \fi. Thus neither #2 nor #3 can end with a macro that expects following parameters, e.g. \textit or \textbf that should take a following Hello as argument. Instead of Hello, \textit gets \else and \textbf gets \fi as parameter, breaking the code.
The standard way is finishing the \if construct first and selecting the argument via \@firstoftwo and \@secondoftwo. The \expandafter closes the current if branch first. The macros \@firstoftwo and \@secondoftwo are defined in LaTeX. Plain TeX does not have an equivalent, thus they need to be defined there.
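A sketch of this pattern that should work in both plain TeX and LaTeX, assuming an e-TeX based engine for \detokenize. The names \IfEmptyTF, \Wrap and \Plain are made up for illustration; the two selector macros use the same definitions as the LaTeX kernel.

```latex
\catcode`\@=11 % make @ a letter (what \makeatletter does in LaTeX)
% Same definitions as in the LaTeX kernel; only needed in plain TeX.
\long\def\@firstoftwo#1#2{#1}
\long\def\@secondoftwo#1#2{#2}
% Hypothetical \IfEmptyTF: the conditional is finished before a branch is chosen.
\def\IfEmptyTF#1#2#3{%
  \if\relax\detokenize{#1}\relax
    \expandafter\@firstoftwo   % \expandafter skips ahead to \fi first
  \else
    \expandafter\@secondoftwo  % \expandafter removes \fi first
  \fi
  {#2}{#3}%
}
\catcode`\@=12 % restore @
% Illustrative macros that expect a following argument.
\def\Wrap#1{[#1]}
\def\Plain#1{#1}
% The selected branch is \Wrap, which now receives {Hello} and gives [Hello].
\IfEmptyTF{}{\Wrap}{\Plain}{Hello}
```

Because neither \else nor \fi is left behind the selected branch, a macro at the end of #2 or #3 can pick up the tokens that follow the call.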