[TeX/LaTeX] Is it possible to automatically enumerate sentences in LaTeX?

environments, numbering, sectioning, sections-paragraphs

1.0.1 Is it possible to automatically enumerate sentences in LaTeX?

2.0.1 I want to thoroughly work through and criticize a longer text. 2.0.2 To facilitate quoting, I would like to typeset the text and enumerate the sentences in this way. 2.0.3 Some time ago, I saw this done in a complex game ruleset.

2.1.1 In the old ruleset, the enumeration was typeset in subscript. 2.1.2 The sentence enumeration was a continuation of the chapter enumeration. 2.1.3 In all aspects except adding the sentence number, the text should be typeset in the normal way. 2.1.4 The numbering should be added automagically without the use of a \beginsentence macro.

3.0.1 In case this works at all, is it possible to additionally reference sentences by their numbers [labels], and have the label automatically updated in case the numbering in the text changes? 3.0.2 Of course, 2.1.4 implies something along the lines of \begin{sentencenumbering} or so.

Best Answer

As others have pointed out, doing it completely automatically is probably quite difficult. If you want to use the \label-\ref mechanism, you would have to insert labels anyway. Let's pick some character which is usually not used in input, such as the vertical bar |, and use it to mark the start of each paragraph and sentence. Ten minutes of hacking and we end up with:

\documentclass{article}

\newcounter{sentence}
\newcounter{para}

\makeatletter
\@addtoreset{sentence}{para}
\@addtoreset{para}{section}

\catcode`\|=\active
\def|{\@ifnextchar[%] to keep my editor happy
  \start@label\start@nolabel}

\def\start@label[#1]{\ifvmode \start@para@label[#1]\else \start@sent@label[#1]\fi}
\def\start@nolabel{\ifvmode \start@para@nolabel\else \start@sent@nolabel\fi}

\def\start@para@label[#1]{%
  \refstepcounter{para}%
  \label{#1}\leavevmode}

\def\start@sent@label[#1]{%
  \refstepcounter{sentence}%
  \label{#1}%
  \thesentence~}

\def\start@para@nolabel{%
  \stepcounter{para}\leavevmode}

\def\start@sent@nolabel{%
  \stepcounter{sentence}%
  \thesentence~}

\makeatother

\renewcommand{\thepara}{\thesection.\arabic{para}}
\renewcommand{\thesentence}{\thepara.\arabic{sentence}}

\begin{document}

\parindent=0pt
\parskip=1em

||These rules must be followed. |The end of a paragraph is indicated
as usual with a blank line.

||[parstart]A new paragraph must start with a vertical
bar. |[sentstart]Each sentence must also start with a vertical
bar. |It follows from~\ref{parstart} and~\ref{sentstart} that a new
paragraph actually starts with two vertical bars.

Without vertical bars, nothing special happens. This might be useful
to comment on the formal rules above or below.

|[parref]|Each vertical bar takes an optional argument. |If
given, it is used as a label. |[sentref]For example, this is
sentence~\ref{sentref} of paragraph~\ref{parref}.

\end{document}

Result

I use the fact that after a \par we're in vertical mode to distinguish the two uses of |. But then we need an explicit \leavevmode, since the paragraph-level | itself does not insert any material that would switch us to horizontal mode. If you want to be able to reference both whole paragraphs and single sentences, you need the double || at the start of each paragraph (each of the two bars may take its own optional argument). If you never need to reference whole paragraphs, it's easy to change the syntax so that only a single | is needed at the start of a paragraph; in fact, it would be much simpler, since | could be implemented as a single macro with an optional argument (which, when in \ifvmode, steps the paragraph counter before doing the other stuff).
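
The single-bar variant just described might look roughly like the following. This is only a sketch under assumptions: the helper name \sentmark is made up for illustration, the counters are declared with the optional argument of \newcounter instead of \@addtoreset, and whole paragraphs can no longer be labelled on their own.

```latex
\documentclass{article}

\newcounter{para}[section]   % paragraph counter, reset at each \section
\newcounter{sentence}[para]  % sentence counter, reset at each new paragraph
\renewcommand{\thepara}{\thesection.\arabic{para}}
\renewcommand{\thesentence}{\thepara.\arabic{sentence}}

% \sentmark is a hypothetical helper name (an assumption of this sketch).
% In vertical mode it first opens a new numbered paragraph; in every case
% it then numbers the sentence and sets a label if one was given.
\newcommand{\sentmark}[1][]{%
  \ifvmode
    \stepcounter{para}% this also resets the sentence counter
    \leavevmode
  \fi
  \refstepcounter{sentence}%
  \if\relax\detokenize{#1}\relax\else\label{#1}\fi
  \thesentence~}

% Make the bar active and let it call the helper.
\catcode`\|=\active
\def|{\sentmark}

\begin{document}

\parindent=0pt
\parskip=1em

|A single bar now opens both the paragraph and its first sentence.
|[second]A label still goes into the optional argument.

|This is a new paragraph, and it can cite sentence~\ref{second}.

\end{document}
```

Note that a label given to the bar at the start of a paragraph now refers to the first sentence of that paragraph, not to the paragraph as a whole.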

Added: I avoided using \everypar since it's not very reliable if lots of other things are happening. However, wrapping stuff in an environment might allow one to use \everypar and provide a simpler syntax. The biggest problem is really to allow the use of labels; we have to tell LaTeX when and how to look for a label.
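
For what it's worth, the \everypar idea could be sketched along the following lines. Treat it as a rough illustration rather than a recommendation: the environment name sentencenumbering is borrowed from the question, \sentmark and \activatebar are hypothetical helper names, and every paragraph inside the environment gets counted whether or not it contains a bar. LaTeX itself reassigns \everypar after section headings and inside lists, so the hook can get lost there, which is the kind of unreliability mentioned above.

```latex
\documentclass{article}

\newcounter{para}[section]
\newcounter{sentence}[para]
\renewcommand{\thepara}{\thesection.\arabic{para}}
\renewcommand{\thesentence}{\thepara.\arabic{sentence}}

% Hypothetical helper: numbers one sentence, with an optional label.
\newcommand{\sentmark}[1][]{%
  \leavevmode % at a paragraph start this triggers \everypar first
  \refstepcounter{sentence}%
  \if\relax\detokenize{#1}\relax\else\label{#1}\fi
  \thesentence~}

% Build the activation code while | is active, so that | stays an
% ordinary character everywhere outside the environment.
\begingroup
  \catcode`\|=\active
  \gdef\activatebar{\catcode`\|=\active \let|\sentmark}
\endgroup

% Inside the environment, each new paragraph steps the para counter
% through \everypar (appended to whatever \everypar already holds).
\newenvironment{sentencenumbering}
  {\par
   \activatebar
   \everypar\expandafter{\the\everypar \stepcounter{para}}}
  {\par}

\begin{document}

\begin{sentencenumbering}
|Each paragraph is counted without any markup of its own.
|[here]A bar still marks the start of each sentence.

|This second paragraph can cite sentence~\ref{here}.
\end{sentencenumbering}

\end{document}
```

Because the catcode of | is changed when the environment starts, this only works if the environment is not used inside the argument of another macro.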