[TeX/LaTeX] Convert all all-caps words to small caps

lua, luatex, text-manipulation, xesearch

I'd like to turn all all-uppercase words, acronyms etc. into small caps, preferably automatically on the fly and without changing the source files. The ideal solution would also allow me to block some of these substitutions out with a blacklist and/or individual mark-up.

In my understanding this cannot be done with LaTeX alone.

Not knowing Lua, I tried to adapt the solution in "Macro: Replace all occurrences of a word", using a pattern along the lines of http://lua-users.org/wiki/FrontierPattern, which I can get to match the right words. But because the replacement is applied to the whole .tex file, including keywords and commands, the typesetting breaks.

Is there a way to apply Lua's string-replacement only to the input-text and nothing else?

As an alternative I had a look at the xesearch package, which processes only the relevant part of the input. But its "(very blunt form of) regular expressions" does not seem to allow searching for all-caps words, as far as I can see.

Best Answer

This uses just pdflatex.

Answer has been EDITED to screen out (catcode 12) "punctuation" in discerning if a word is "all caps".

REEDITED to take multi-paragraph arguments. EDITED to work properly when a paragraph ends with a \[...\] display. EDITED to improve scope containment as follows:

  1. Use of {...} will provide scope containment, but any CAPS word in the group will not be made into small caps;

  2. Use of \bgroup...\egroup{} will provide scope containment, and any CAPS word in the group will be made into small caps.

This fix to scope containment for {...} was accomplished by adding a \junkchar at the beginning of every word, and then stripping it out prior to typesetting.
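The brace problem that \junkchar works around can be seen in isolation. In the minimal sketch below, the helper names \plainword and \guardedword are made up for illustration, and a literal + stands in for \junkchar. A space-delimited argument that consists of exactly one brace group loses its outer braces, so the font change escapes the group; a leading guard character prevents that and is removed again with \Gobble.

```latex
\documentclass{article}
\makeatletter\let\Gobble\@gobble\makeatother
% Hypothetical helpers: both grab the text up to the first space.
\def\plainword#1 #2\relax{[#1]}%          outer braces of #1 are stripped
\def\guardedword#1 #2\relax{[\Gobble#1]}% guard keeps braces, \Gobble drops it
\begin{document}
% #1 is exactly {\itshape word}, so TeX removes the braces and
% the italic shape leaks into the text that follows.
\plainword{\itshape word} rest\relax\ leaks

% #1 is +{\itshape word}: the braces survive, the + is gobbled,
% and the italic shape stays inside the group.
\guardedword+{\itshape word} rest\relax\ contained
\end{document}
```

This is only a simplification of the idea; the full listing passes the guard through the capitalization test as well before discarding it.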

What is good about the current solution is that macros and inline math do not disturb the algorithm.

However, there is still (at least) one issue: as with any macro argument, \verb cannot appear inside the argument of \sccaps.
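One way around this, sketched here under the assumption that the verbatim text can sit between two separate calls, is to close the \sccaps argument before \verb and reopen it afterwards. The sample sentence is invented for illustration.

```latex
% Fragment for the document body; assumes the \sccaps definition
% from the listing below is already in the preamble.
\sccaps{The NASA archive stores each record under}
\verb|RAW_DATA|
\sccaps{and ONLY the surrounding text is scanned.}
```

The verbatim text is then outside any macro argument, so \verb can change category codes as usual, and it is never seen by the all-caps test.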

\documentclass{article}
\usepackage{stringstrings,ifthen}
\makeatletter\let\Gobble\@gobble\makeatother
\def\junkchar{+}% MUST BE ANY catcode 11 OR 12 character
% TESTS IF WORD IS ALL CAPITAL LETTERS (catcode 12 PUNCTUATION IS SCREENED)
\def\testcaps#1{%
  \def\capword{T}\testeleven#1\relax\relax%
  \if T\capword\caseupper[q]{#1}%
    \ifthenelse{\equal{#1}{\thestring}}{}{\def\capword{F}}%
  \fi}
% TESTS IF ALL LETTERS OF WORD ARE \catcode 11 or \catcode 12, THEN \capword STAYS {T}
\def\testeleven#1#2\relax{%
  \ifcat\noexpand#1A\ifx\relax#2\relax\else\testeleven#2\relax\fi\else%
    \ifcat\noexpand#10\ifx\relax#2\relax\else\testeleven#2\relax\fi\else%
      \def\capword{F}%
    \fi%
  \fi}
% CONVERTS CAP WORDS OF ARGUMENT INTO SMALL CAPS (\par ALLOWED)
\long\def\sccaps#1{\testcappars#1\par\relax\relax}
% PARSES \sccaps ARGUMENT INTO PARAGRAPHS AND INVOKES \testcapwords ON EACH PARAGRAPH 
\long\def\testcappars#1\par#2\relax{%
    \expandafter\testcapwords\junkchar#1 \relax\relax% EXPAND \junkchar FIRST, AS IN \testcapwords
    \ifx\relax#2\else\unskip\par\testcappars#2\relax\fi}
% PARSES PARAGRAPH INTO WORDS; TESTS EACH WORD; IF ALL CAPS, MAKES IT SMALL CAPS
\def\testcapwords#1 #2\relax{%
    \testcaps{#1}\if T\capword\makesc#1\relax\relax\else\Gobble#1\fi%
    \ifx\relax#2\else\ \expandafter\testcapwords\junkchar#2\relax\fi}
% PRODUCES SMALL CAPS WORD FROM KNOWN UPPERCASE WORD.
\def\makesc#1#2#3\relax{\textsc{#2\caselower{#3}}}% #1 IS STRIPPED \junkchar
\begin{document}
\sccaps{% CAN'T START ARGUMENT WITH A SPACE
This is a test of the EMERGENCY BROADCAST SYSTEM.
This is ONLY a test.  The EBS may be consulted for further \textbf{information}.
What about paragraphs that CONTAIN $Ax^2 + Bx + C$ math data?  
\[
y = Ax^2 + Bx + C
\]
or EVEN displaystyle math?

And NOW for a second paragraph at 9:30 PM.

Testing for \bgroup\tiny group LIMITING behavior\egroup{}, but you 
must use bgroup and egroup as the containment if you want CAPS in the group
to be made small caps.

Using braces, {\tiny group LIMITING behavior}, provides group
containment, but CAP words in the group, like LIMITING, are not small-capped.
}
\end{document}
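The word-by-word parsing in \testcapwords relies on a space-delimited parameter and a recursive call on the remainder. A stripped-down sketch of just that loop, with a hypothetical macro \eachword that brackets each word instead of testing it, might look like this:

```latex
\documentclass{article}
% Hypothetical demo macro: #1 is the text up to the first space,
% #2 is everything after it up to the terminating \relax.
\def\eachword#1 #2\relax{%
  [#1]%
  % Recurse only if something is left; an empty #2 makes the
  % test compare \relax with \relax, which ends the loop.
  \ifx\relax#2\relax\else\eachword#2\relax\fi}
\begin{document}
% The space before \relax is required so the last word is delimited.
\eachword one two three \relax
\end{document}
```

This typesets [one][two][three]. Because the split happens only at spaces outside braces, a braced group or a macro with its braced argument travels through as part of a single word, which is consistent with the scope-containment behavior described above.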

