Best practices for typesetting pseudocode

Tags: algpseudocode, best practices, pseudocode, typography

I need to typeset pseudocode for several pieces of code in LaTeX.
I found in this answer a good overview of the three main packages available to render pseudocode.
In what follows I adopt the algpseudocode package, which seems to me the handiest and most complete one.
However, there are a number of details and best practices that I do not know and for which I could not find a satisfying answer.
My goal is to achieve the best possible readability of the pseudocode (algorithms can be tricky on their own, no reason to make them even harder).

Consider a single line of pseudocode. All of its content (except comments) is usually rendered in math mode, even in the examples from the package documentation. Is this the right choice from a typographical point of view?
If so, I have an issue with variable names: one-letter variables (i, j, k, …) look fine, but names with more than one letter look awful. I try to use one-letter names whenever possible, but sometimes more descriptive names are needed for the sake of clarity.
My current solution is the following:

\algnewcommand{\Var}[1]{\mathit{#1}}

\begin{algorithmic}
    \State $\Var{my\_var} \gets 1$
\end{algorithmic}

However, it is not perfect: I cannot use it outside math mode to print a variable name in running text in the same style as in the pseudocode.

Should I choose camelCase or snake_case for variable names? With snake_case, how do I render the underscore correctly?
Roman or italic for variable names?
When calling a function with more than one parameter, is the spacing inserted after the comma right?

\begin{algorithmic}
    \State $\Call{MyFunc}{arg1, arg2, arg3}$
\end{algorithmic}

Since each line of code is rendered in math mode, one might think that math symbols are preferable whenever possible. However, I am not sure this is always the best choice. One case that comes up often in my pseudocode is the following. Which of the two variants do you think is more suitable, and why?

\algnewcommand{\In}{\mathbf{in}}

\begin{algorithmic}
    \ForAll{$i \in \{1, \ldots, n\}$}
        \State ...
    \EndFor
    \ForAll{$i \In 1, \ldots, n$}
        \State ...
    \EndFor
\end{algorithmic}

Is there a handy way to refer to a block (i.e., many consecutive lines) of code?
Something that provides the same result as the following (possibly requiring less code):

\begin{algorithmic}
    \For{$i = 1,2$} \label{algblock-start:for-loop}
        \State ...
    \EndFor \label{algblock-end:for-loop}
\end{algorithmic}
At lines \ref{algblock-start:for-loop}--\ref{algblock-end:for-loop} ...

How to typeset "special" values like null, true and false? My current solution is to use \mathbf:

\algnewcommand{\Null}{\mathbf{null}}
\algnewcommand{\True}{\mathbf{true}}
\algnewcommand{\False}{\mathbf{false}}

[UPDATE] How to typeset the initialisation of common data structures such as arrays, hashmaps, queues and so on? At the moment I do the following:

\begin{algorithmic}
    \State $\Var{my\_array} \gets \Call{Array}{\Null}$
    \State $\Var{my\_map} \gets \Call{HashMap}{\Null}$
    \State $\Var{my\_queue} \gets \Call{Queue}{\Null}$
\end{algorithmic}

Best Answer

There are several ways to make a command like \Var work both inside and outside math mode, for example either of the following definitions:

\algnewcommand\Var[1]{\mbox{\itshape #1}}        % option 1: text italics inside an \mbox
\algnewcommand\Var[1]{\ensuremath{\mathit{#1}}}  % option 2: math italics (use one definition, not both)
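
As a minimal sketch of how such a definition might be used, the document below picks the \ensuremath variant and prints the same variable name once in running text and once inside the pseudocode. The variable name count is only an illustrative assumption.

```latex
\documentclass{article}
\usepackage{algpseudocode}

% \ensuremath variant: math italics both inside and outside math mode.
\algnewcommand\Var[1]{\ensuremath{\mathit{#1}}}

\begin{document}
% Outside math mode, in running text.
The counter \Var{count} is initialised before the loop.

\begin{algorithmic}
    % Inside math mode, within a line of pseudocode.
    \State $\Var{count} \gets 1$
\end{algorithmic}
\end{document}
```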

camelCase versus snake_case is a matter of taste, of community, etc. Long variable names are often not very readable in the context of an algorithm. Single letters or words are usually enough. If you really need long variable names, it may be better to use the syntax highlighting of the listings package (the lstlisting environment) or of other packages: write your algorithm in a common programming language, but leave out unnecessary syntactic elements (so it will not compile). E.g., some people use Python-like algorithms with Python highlighting.

If you want to use underscores in variable names, you can use something like
```
\algnewcommand\Var[1]{\mbox{\ttfamily\detokenize{#1}}}
... \Var{abc_def} ... $\Var{abc_def}$ ...
```
As for roman versus italic: if the algorithm is close to math, with short variable names, use italic. Otherwise, if it is more like a program, you could consider something like a typewriter font (see above).
As for the spacing in function calls: if the parameters can easily be told apart visually, it is fine; otherwise insert as much space as needed to make them readable. Space is often scarce (e.g. when typesetting in two columns), so inserting space everywhere, even where it is not needed, may not be the best choice.
People like math mode because it is easy to add sub- and superscripts and typeset mathematical expressions in 2D instead of linearly. If you don't use mathematical expressions, you probably don't need math mode.
Given a particular pseudocode package, one could implement support for referencing blocks, but I'm not aware that it has already been done.
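A rough sketch of what such support could look like, reusing the label scheme from the question, is given below. The macros \BlockStart, \BlockEnd and \BlockRef are hypothetical helpers, not part of algpseudocode, and the sketch assumes line numbering is switched on with the [1] option so that the references resolve to line numbers (after two LaTeX runs).

```latex
\documentclass{article}
\usepackage{algpseudocode}

% Hypothetical helpers (not provided by algpseudocode):
% \BlockStart{name} and \BlockEnd{name} label the first and last line of a block,
% \BlockRef{name} prints the corresponding range of line numbers.
\newcommand{\BlockStart}[1]{\label{algblock-start:#1}}
\newcommand{\BlockEnd}[1]{\label{algblock-end:#1}}
\newcommand{\BlockRef}[1]{\ref{algblock-start:#1}--\ref{algblock-end:#1}}

\begin{document}
\begin{algorithmic}[1]
    \For{$i = 1, 2$} \BlockStart{for-loop}
        \State $x \gets x + i$
    \EndFor \BlockEnd{for-loop}
\end{algorithmic}
At lines \BlockRef{for-loop} the loop accumulates the sum.
\end{document}
```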
Again, how to typeset constants like null is a matter of taste or community. Personally, I don't like boldface, as it sticks out too much visually. Consider using a typewriter font. Or, if you are that close to a programming language, consider using that language as the pseudo-language, with highlighting.
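A minimal sketch of the typewriter variant, keeping the command names from the question, might look as follows. Since \texttt is a text font command, these definitions should work both inside math mode and in running text. The variables found and p are illustrative assumptions.

```latex
\documentclass{article}
\usepackage{algpseudocode}

% Typewriter instead of boldface for the special values.
\algnewcommand{\Null}{\texttt{null}}
\algnewcommand{\True}{\texttt{true}}
\algnewcommand{\False}{\texttt{false}}

\begin{document}
\begin{algorithmic}
    \State $\mathit{found} \gets \False$
    \If{$p = \Null$}
        \State \Return \True
    \EndIf
\end{algorithmic}
% The same macro used in running text.
The value \Null{} marks a missing entry.
\end{document}
```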
Your data structure initialisation looks like Java, so why not write it in pseudo-Java using listings, minted, etc.?
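As a sketch of that idea with the listings package, the snippet below typesets Java-like pseudocode with Java keyword highlighting. The class names Array and Queue and the variable names are illustrative only; the code is not meant to compile as Java.

```latex
\documentclass{article}
\usepackage{listings}

% Java highlighting in a small typewriter font.
\lstset{
    language=Java,
    basicstyle=\ttfamily\small
}

\begin{document}
\begin{lstlisting}
myArray = new Array();
myMap = new HashMap();
myQueue = new Queue();
\end{lstlisting}
\end{document}
```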

As a side note, it is unusual when presenting an algorithm to refer to particular implementations like HashMap. Instead, one tries to abstract away from implementation details and stick to tuples, lists and sets (hence the preference for math notation). When discussing things like complexity, you will then assume that these abstract data types with their operations are implemented with an optimal data structure, assuming e.g. hashed keys and constant time for accessing elements.
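
To illustrate the more abstract style, here is one possible sketch in which the three initialisations from the question are written with mathematical notation instead of named implementations. The symbols A, M and Q and the notation for empty sequences and maps are assumptions chosen for this example, not an established convention.

```latex
\documentclass{article}
\usepackage{algpseudocode}

\begin{document}
\begin{algorithmic}
    \State $A \gets ()$ \Comment{empty sequence, used as an array}
    \State $M \gets \emptyset$ \Comment{empty map, i.e. no key-value pairs}
    \State $Q \gets ()$ \Comment{empty sequence, used as a queue}
\end{algorithmic}
\end{document}
```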
