[TeX/LaTeX] A design question: citation commands

bibtex citing

I don’t know if this is the forum for fundamental design questions, but I have been wondering for some time about the evolution of \cite commands. With natbib in authoryear format, \citep is necessary to get a citation like [Jones 2010]. If you use plain \cite you get Jones [2010]. With newer packages you have yet other variations like \parencite.

What seems odd to me is that the default referencing format varies. Why does \cite not give you what you normally want, with a separate notation for pulling the author name out of the citation that keeps working if you switch to numeric format? If you use a numeric format, \cite gives you a plain citation, so if you were using an authoryear format and change it, something like “according to Jones [2010]” turns into “according to [42]”, which does not read well.

For example, how about this?

  • \cite{key} – default style format ([Jones 2010], [42] etc.)
  • \citename{key} – author name(s) (e.g. Jones – with style-defined options like when to use et al.)
  • \citeyear{key} – year
  • \citename[cite]{key} – author name(s) plus whatever else you need to complete the citation – e.g., Jones [2010] or Jones [42].

The way it currently works is troubling because switching styles etc. should work without having to respell your citation markup. And the simplest notation can switch from right to wrong.

Best Answer

Design questions, and especially "why" questions, are always a bit tricky to answer unless you are the developer or there is extensive documentation about the decision process during development. Without that kind of insider information the answer can often come down to: because that's how it is.

At least in standard LaTeX and with BibTeX, citations and the bibliography are intricately related, so this discussion will feature both.

LaTeX kernel

In the standard LaTeX classes the bibliography environment thebibliography is basically a slightly extended version of the enumerate environment that uses \bibitem instead of \item. \cite is more or less a \ref to a label generated by \bibitem. (Unlike the usual \ref, \cite can accept several comma-separated keys and has an optional postnote argument; additionally, all \cites are logged in the .aux file. The actual implementation is therefore much more complex than that of \ref, but the idea is the same.)

This setup is very natural for numeric citations of the simple form

[1] and [1, 2]

as they are common in many STEM areas.

The optional argument of \bibitem can be used to override the automatically generated numeric label, which is convenient for alphabetic labels of the form

[SR98] and [SR98, Nus78]

although you will have to provide the labels manually.
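As a minimal sketch of the kernel mechanism described above (the keys and the abbreviated entry texts are only illustrative), a hand-written thebibliography with one automatic numeric label and one manual alphabetic label could look like this:

```latex
\documentclass{article}
\begin{document}
% \cite looks up the label that \bibitem wrote to the .aux file,
% so two LaTeX runs are needed before the labels resolve.
See \cite{sigfridsson} and \cite[p.~30]{sigfridsson,nussbaum}.

% The mandatory argument is the widest label, used only for indentation.
\begin{thebibliography}{Nus78}
  % No optional argument: the label is the running number.
  \bibitem{sigfridsson}
    Emma Sigfridsson and Ulf Ryde. Comparison of methods for deriving
    atomic charges from the electrostatic potential and moments. 1998.
  % Optional argument: a manually supplied alphabetic label.
  \bibitem[Nus78]{nussbaum}
    Nussbaum. 1978. (Illustrative placeholder entry.)
\end{thebibliography}
\end{document}
```

Mixing numeric and alphabetic labels in one list is done here only to show both forms side by side; a real bibliography would normally stick to one scheme.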

BibTeX

Your average BibTeX style just automatically generates a thebibliography environment for use with LaTeX. As such it can help you by automatically calculating alphabetic labels for entries. But BibTeX has little business meddling with the implementation of \cite and thebibliography itself. That means that the output of citations largely stays untouched.

In theory it would be possible to use the optional argument of \bibitem to pass complete author-year labels of the form 'Sigfridsson and Ryde 1998' through to LaTeX. Furthermore, thebibliography could be redefined to drop the label in the bibliography and the square brackets could be removed so that \cite could be used to obtain

Sigfridsson and Ryde 1998 and Sigfridsson and Ryde 1998, Nussbaum 1978

This is what apalike.bst and apalike.sty of standard BibTeX do. But that can be unsatisfying, since LaTeX doesn't actually know the author and the year of a citation here; it just knows the entire label. It is therefore hard to make a style that prints

Sigfridsson and Ryde 1998

at one point output something slightly different like

Sigfridsson and Ryde (1998)

at another point.

There are packages like cite to modify the output of the \cite command quite comfortably (cite also does other nice things), but they can't really change the fundamental limitation that there is only one cite label and therefore really only one \cite command.
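The 'one label' limitation can be seen in a small hand-written sketch. It does not reproduce what apalike actually writes; it only assumes a label of the form used in this answer, passed through the optional argument of \bibitem in a standard class:

```latex
\documentclass{article}
\begin{document}
% The kernel \cite prints the stored label as one unit, inside square
% brackets. It has no way to reach the year on its own.
\cite{sigfridsson}

\begin{thebibliography}{Sigfridsson and Ryde 1998}
  % Author and year travel together as a single opaque string.
  \bibitem[Sigfridsson and Ryde 1998]{sigfridsson}
    Emma Sigfridsson and Ulf Ryde. Comparison of methods for deriving
    atomic charges from the electrostatic potential and moments. 1998.
\end{thebibliography}
\end{document}
```

Because the label arrives as one string, a variant such as 'Sigfridsson and Ryde (1998)' would require parsing that string, which is exactly what the packages in the next section avoid.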

BibTeX-based extension packages

The limitations of the 'one label' approach made people come up with methods to smuggle some raw information about an entry back to LaTeX. Instead of just telling LaTeX that the label of the entry is 'Sigfridsson and Ryde 1998', the styles tell LaTeX that the author list of that entry is 'Sigfridsson and Ryde' and that the year is '1998'. This idea makes it possible to implement several different \cite-like commands with different output.

Packages that do this usually redefine \cite (and possibly thebibliography) and rely on specific .bst styles that use the provided interfaces to pass on author and year information.
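The idea can be illustrated with a toy sketch. None of the macros below belong to any real package: \toyentry, \toycitet and \toycitep are hypothetical names, and real packages receive the data from the .bst style via \bibitem and the .aux file rather than from a command in the preamble.

```latex
\documentclass{article}
\makeatletter
% Hypothetical toy interface: store author list and year separately per key.
\newcommand*\toyentry[3]{% #1 = key, #2 = author list, #3 = year
  \@namedef{toy@author@#1}{#2}%
  \@namedef{toy@year@#1}{#3}}
% Because the pieces are stored separately, different commands can
% assemble them differently.
\newcommand*\toycitet[1]{% textual: Authors (Year)
  \@nameuse{toy@author@#1} (\@nameuse{toy@year@#1})}
\newcommand*\toycitep[1]{% parenthetical: (Authors, Year)
  (\@nameuse{toy@author@#1}, \@nameuse{toy@year@#1})}
\makeatother

\toyentry{sigfridsson}{Sigfridsson and Ryde}{1998}

\begin{document}
As shown by \toycitet{sigfridsson}, \dots

This has been studied before \toycitep{sigfridsson}.
\end{document}
```

The sketch ignores multiple keys, notes and undefined keys; it is only meant to show why passing author and year separately makes several citation commands possible.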

One very early package that does this is newapa from 1985 (at least according to the CTAN page; the source code mentions that version 2.0 is from July 1991), which defines the following commands:

% The ``newapa.bst'' BibTeX bibliography style creates citations with labels:
%       \citeauthoryear{author-info}{abbrev. author-info}{year}
[...]
% \cite[optional notes]{Key(s)}
%     -> (Authors1, Year1; Authors2, Year2; ..., optional notes)
% \citeA[optional notes]{key}
%     -> Authors (Year, optional notes)
%     Note: ONE AND ONLY ONE KEY.
%           \citeA[pp.~3--5]{Apt88,Lloyd87} does not make sense at all.
%           In this case, the outcome will look aweful.
% \citeB{keys}
%     -> Authors1 (Year1), Authors2 (Year2), ...
%     Note: \citeB[Notes]{keys} are given, notes will be ingored,
%           because it does not make sense at all.
% \citeauthor[optional notes]{key}
%     -> Authors1, Authors2, ..., optional notes
%
% The difference between `\shortciteXXX' and `\citeXXX':
% is that `\shortciteXXX' gives `First author et al.'
% if no. authors >= 3.
[... \shortcite ...]
% \citeyear[optional notes]{key}
%     -> (Year, optional notes)
%

chicago from 1992 is based on newapa, follows a similar approach and defines:

% The ``chicago'' BibTeX bibliography style creates citations with labels:
%       \citeauthoryear{author-info}{abbrev. author-info}{year}
%
% These labels are processed by the following LaTeX commands:
%
%  \cite{key}
%    which produces citations with full author list and year.
%    eg. (Brown 1978; Jarke, Turner, Stohl, et al. 1985)
%  \citeNP{key}
%    which produces citations with full author list and year, but without
%    enclosing parentheses:
%    eg. Brown 1978; Jarke, Turner and Stohl 1985
%  \citeA{key}
%    which produces citations with only the full author list.
%    eg. (Brown; Jarke, Turner and Stohl)
%  \citeANP{key}
%    which produces citations with only the full author list, without
%    parentheses eg. Brown; Jarke, Turner and Stohl
%  \citeN{key}
%    which produces citations with the full author list and year, but
%    can be used as nouns in a sentence; no parentheses appear around
%    the author names, but only around the year.
%      eg. Shneiderman (1978) states that......
%    \citeN should only be used for a single citation.
%  \shortcite{key}
%    which produces citations with abbreviated author list and year.
%  \shortciteNP{key}
%    which produces citations with abbreviated author list and year.
%  \shortciteA{key}
%    which produces only the abbreviated author list.
%  \shortciteANP{key}
%    which produces only the abbreviated author list.
%  \shortciteN{key}
%    which produces the abbreviated author list and year, with only the
%    year in parentheses. Use with only one citation.
%  \citeyear{key}
%    which produces the year information only, within parentheses.
%  \citeyearNP{key}
%    which produces the year information only.
%

The harvard package defines (amongst other commands and variants of the below)

  • \citeasnoun for citations of the form

    Sigfridsson and Ryde (1998)

  • \cite for citations of the form

    (Sigfridsson and Ryde 1998)

  • \citeyear

    (1998)

  • \citename

    Sigfridsson and Ryde

Finally, natbib defines, amongst many other commands (such as \citeyear and \citeauthor), the main commands

  • \citet

    Sigfridsson and Ryde (1998)

  • \citep

    (Sigfridsson and Ryde, 1998)

natbib also explicitly deprecates \cite:

Both \citep and \citet are defined by natbib and are thus not standard. The standard LaTeX command \cite should be avoided, because it behaves like \citet for author–year citations, but like \citep for numerical ones.

Note that, unlike the packages mentioned before, natbib supports both author-year and numeric citations.
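A minimal sketch of what that means in practice, assuming a hypothetical refs.bib that contains an entry with the key sigfridsson: the same markup works in both modes, and only the package option changes.

```latex
\documentclass{article}
% Remove the 'numbers' option to go back to author-year citations;
% the citation commands in the body stay exactly the same.
\usepackage[numbers]{natbib}

\begin{document}
% Textual citation: author names followed by the number in numeric mode,
% author names followed by the year in author-year mode.
As shown by \citet{sigfridsson}, \dots

% Parenthetical citation: only the number in numeric mode,
% authors and year in author-year mode.
This has been studied before \citep[p.~380]{sigfridsson}.

\bibliographystyle{plainnat}
\bibliography{refs}% assumes a file refs.bib (hypothetical name)
\end{document}
```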

natbib.dtx contains very interesting comments about the history of \cite, \citet and \citep, which I quote here in full:

% \begin{macro}{\citet}
% \changes{6.2}{1996 Jan 11}{Add macro for textual cite in numerical mode}
% \changes{6.5}{1997 Jan 30}{Make this standard textual citation, with notes}
% \changes{6.6}{1997 May 26}{Define with \cs{ifNAT@par} on}
% \changes{6.6}{1997 May 27}{Redefine for new coding of \cs{@citex}}
% \changes{6.8}{1998 Feb 19}{Place inside a group}
% \changes{8.2}{2008 Jul 01}{(AO) Assign \cs{NAT@ctype} with \cs{let} \cs{z@}: allow for tighter \cs{ifnum} comparisons.}
% Textual citations are made with the |\citet| command, which has a starred
% form for full authors, and may have one or two optional arguments for notes.
%
% This command is equivalent to the older syntex |\cite|\marg{key} without
% optional arguments, at least in author--year mode. In numerical mode,
% it prints the authors and then the numerical reference, whereas |\cite|
% will only print the number. This simplifies the switching between the two
% modes with minimal changes in text. One wants to replace ``as shown by Jones
% et~al.\ (1990)'' with ``as shown by Jones et~al.\ [21]''; with |\cite|,
% one would get ``as shown by [21]''.
%
% This is such a practical idea, that I have decided to make it the standard
% means of textual citing. It now even allows optional notes in textual cites,
% something that I have been asked to provide, and which is impossible under
% the old syntex with |\cite|.
%
% Note that the flag |\ifNAT@swa| determines if the citation is textual or
% parenthetical, and |\ifNAT@full| whether abbreviated or full author names
% are to be printed.
% The flag |\ifNAT@par| enables the opening and closing parentheses; this is
% used only to suppress these for |\citealt| below.
%
% The flag |\NAT@ctype| controls what |\@citex| actually outputs: author and
% year (0), author only (1), year only (2). (For numerical mode, 0 yields
% author [number].)
%
% All the citation commands are placed inside a group; they start
% with |\begingroup|, with the corresponding |\endgroup| added to
% the |\@cite| command. This localizes the flag settings so that cites may
% be contained with cites, as
% \begin{quote}
%    |\citep[cited in \citealp{xx}]{yy}|
% \end{quote}
% (Without the localization, the |\citealp| turns off the brackets before
%   |\citep| comes to an end.)
%    \begin{macrocode}
\newif\ifNAT@full\NAT@fullfalse
\newif\ifNAT@swa
\DeclareRobustCommand\citet
   {\begingroup\NAT@swafalse\let\NAT@ctype\z@\NAT@partrue
     \@ifstar{\NAT@fulltrue\NAT@citetp}{\NAT@fullfalse\NAT@citetp}}
\newcommand\NAT@citetp{\@ifnextchar[{\NAT@@citetp}{\NAT@@citetp[]}}
\newcommand\NAT@@citetp{}
\def\NAT@@citetp[#1]{\@ifnextchar[{\@citex[#1]}{\@citex[][#1]}}
%    \end{macrocode}
% \end{macro}
%
% \begin{macro}{\citep}
% \changes{5.5}{1995 Mar 27}{Add macro as shorthand for \cs{cite[]}}
% \changes{6.5}{1997 Jan 30}{Make this standard parenthetical citation,
%        with notes}
% \changes{6.6}{1997 May 26}{Define with \cs{ifNAT@par} on}
% \changes{6.6}{1997 May 27}{Redefine for new coding of \cs{@citex}}
% \changes{6.8}{1998 Feb 19}{Place inside a group}
% \changes{8.2}{2008 Jul 01}{(AO) Assign \cs{NAT@ctype} with \cs{let} \cs{z@}: allow for tighter \cs{ifnum} comparisons.}
% Parenthetical citations are made with the |\citep| command, which has a
% starred form for full authors, and may have one or two optional arguments for
% notes.
%
% This command was originally added by special request to be a simplification
% for |\cite[]|\marg{key}, and I added it grudgingly. After I reprogrammed
% \thestyle\ to handle author--year and numerical citations with the same
% \texttt{.bst} file, I needed to add the |\citet| command, and this one
% became an obvious companion for parenthetical citations. I have now altered
% its coding to be the standard, being fully equivalent in functionality
% to the older syntax |\cite|\oarg{pre}\oarg{post}\marg{key}.
%    \begin{macrocode}
\DeclareRobustCommand\citep
   {\begingroup\NAT@swatrue\let\NAT@ctype\z@\NAT@partrue
         \@ifstar{\NAT@fulltrue\NAT@citetp}{\NAT@fullfalse\NAT@citetp}}
%    \end{macrocode}
% \end{macro}
%
% \begin{macro}{\cite}
% \changes{5.0}{1994 May 18}{Add a second optional argument.}
% \changes{5.3}{1994 Sep 19}{Add starred version to print full author list.}
% \changes{5.4}{1994 Nov 24}{Replace \cmd{\if@tempswa} with \cmd{\ifNAT@swa}.}
% \changes{6.5}{1997 Jan 30}{Declare obsolete, retained for compatibility}
% \changes{6.6}{1997 May 27}{Redefine for new coding of \cs{@citex}}
% \changes{6.8}{1998 Feb 19}{Place inside a group}
% \changes{8.2}{2008 Jul 01}{(AO) Assign \cs{NAT@ctype} with \cs{let} \cs{z@}: allow for tighter \cs{ifnum} comparisons.}
% The |\cite| command was originally used to produce both textual and
% parenthetical citations; with an option argument, even an empty one,
% the citation was parenthical. I now recommend the use of |\citet| and
% |\citep| instead, retaining |\cite| only for compatibility.
%
% In numerical mode, it must be parenthetical (|ifNAT@swa| \meta{true}) both
% with and without notes, so that it emulates the standard \LaTeX\ usage. In
% author--year mode, this flag depends on presence or absence of notes.
%    \begin{macrocode}
\DeclareRobustCommand\cite
    {\begingroup\let\NAT@ctype\z@\NAT@partrue\NAT@swatrue
      \@ifstar{\NAT@fulltrue\NAT@cites}{\NAT@fullfalse\NAT@cites}}
\newcommand\NAT@cites{\@ifnextchar [{\NAT@@citetp}{%
     \ifNAT@numbers\else
     \NAT@swafalse
     \fi
    \NAT@@citetp[]}}
%    \end{macrocode}
% \end{macro}

Originally, natbib did not have \citep and \citet, and the behaviour of \cite for author-year citations would change depending on the presence of the optional (page number/postnote) argument. The introduction of \citet and \citep then meant that there was a consistent interface that would remain stable even if the style was changed from author-year to numeric, but the behaviour of \cite was kept the same for backwards compatibility.

As to why \cite originally behaved the way it did, I can't offer more than speculation. Redefining \cite is the obvious choice, because people know it. The numeric behaviour was very likely kept in line with standard LaTeX for compatibility reasons (and because it made sense). The author-year behaviour was probably born out of a desire not to introduce different commands for '(Nussbaum, 1978)' and 'Nussbaum (1978)'. If an optional argument is present, the citation refers to a particular point in the source, in which case the '(Nussbaum, 1978, p. 30)' form is often more appropriate than 'Nussbaum (1978, p. 30)'. If the argument is not present, that is an easy marker to go for the lighter textual citation 'Nussbaum (1978)'.

Another citation package that should be mentioned when we talk about the evolution of bibliography styles is jurabib, especially because I see it as the link between traditional BibTeX .bst styles and biblatex. But it is fairly uninteresting when it comes to \cite. The package defines \cite to give citations without parentheses and \footcite for citations in footnotes. Additionally, it has emulations for natbib's \citep and \citet. Its \citefield command was probably a new feature.

biblatex

biblatex came in much later than the packages mentioned before and completely reimplements citation and bibliography handling. biblatex also comes with yet another set of citation commands, amongst others

  • \cite

    Sigfridsson and Ryde 1998 // [2]

  • \parencite

    (Sigfridsson and Ryde 1998) // [2]

  • \textcite

    Sigfridsson and Ryde (1998) // Sigfridsson and Ryde [2]

It also defines \autocite (which I highly recommend) to make it easier to switch between citation styles.
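As a minimal sketch of how these commands are used (assuming the biber backend and the biblatex-examples.bib file that is distributed with biblatex, which contains the sigfridsson entry), switching styles only requires changing the style option:

```latex
\documentclass{article}
% Change style=authoryear to style=numeric (or a verbose style)
% without touching the citation commands in the body.
\usepackage[style=authoryear, backend=biber]{biblatex}
\addbibresource{biblatex-examples.bib}

\begin{document}
% \textcite always keeps the author names in the running text.
As shown by \textcite{sigfridsson}, \dots

% \autocite picks the citation form the style considers standard:
% parenthetical for authoryear and numeric, a footnote for verbose styles.
This has been studied before \autocite[380]{sigfridsson}.

\printbibliography
\end{document}
```

Compile with LaTeX, then Biber, then LaTeX again.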


So just to recap: Standard LaTeX only has \cite, and that \cite really only does numeric and alphabetic citations. There are quite a few extra packages that enhance LaTeX's citation and bibliography interface, and each of those packages defines a slightly different set of commands that behave slightly differently.

Each of the package authors will have had their own reasons for doing the things the way they did, from their idea of convenience over aesthetic considerations, to practical constraints and backwards compatibility...

The answer to the question "Why does \cite not give you what you normally want?" is that it is not at all clear from the start what one would normally want. Some people may only want "[2]" when they cite, some may prefer "Sigfridsson and Ryde [2]".

You may also be interested in Universal `\cite` commands or defining new cite commands. But the answer to your problem might just be to follow the explicit advice in the natbib documentation and stop using \cite: Use \citet or \citep.

\documentclass[british]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{babel}
\usepackage{natbib}

\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@article{sigfridsson,
  author  = {Sigfridsson, Emma and Ryde, Ulf},
  title   = {Comparison of methods for deriving atomic charges from the
             electrostatic potential and moments},
  journal = {Journal of Computational Chemistry},
  year    = 1998,
  volume  = 19,
  number  = 4,
  pages   = {377-395},
  doi     = {10.1002/(SICI)1096-987X(199803)19:4<377::AID-JCC1>3.0.CO;2-P},
}
\end{filecontents}

\begin{document}
\cite{sigfridsson}

\cite[380]{sigfridsson}

\citep{sigfridsson}

\citep[380]{sigfridsson}

\citet{sigfridsson}

\citet[380]{sigfridsson}

\bibliographystyle{plainnat}
\bibliography{\jobname}
\end{document}

Sigfridsson and Ryde [1998]
[Sigfridsson and Ryde, 1998, 380]
[Sigfridsson and Ryde, 1998]
[Sigfridsson and Ryde, 1998, 380]
Sigfridsson and Ryde [1998]
Sigfridsson and Ryde [1998, 380]