[Tex/LaTex] Handling of special LaTeX characters in text


I have a LaTeX file for a book chapter, which may need to be converted to Word at some point, because that is what the publishers use (sigh). This article doesn't have any math in it, but I'm still using LaTeX, because even when not writing mathematics, it is better than the alternatives. Unfortunately, I'm making heavy use of characters that are only supposed to appear inside a math environment, notably the underscore character (_). I do use the math environment in a couple of places, but only very briefly.

So, my question is, can I tell LaTeX to make an exception for this specific character, and actually treat the underscore as an underscore? This would make later conversion to Word or some other word processing format easier. The alternative is to use \_ in lots of places. Bonus points if there is some way to still use the underscore inside a math environment, though I'm not sure how that would work. To be clear, I'd like a general technique that would work with any of the special LaTeX characters, though I suppose remapping most of the other ones would cause more trouble than it is worth.

I'm not sure what tags to use here, so please feel free to add. Thanks.

EDIT: It turns out this is a FAQ – How to typeset an underscore character.

Best Answer

For the underscore it's quite easy:

\documentclass{article}
\usepackage[T1]{fontenc}

\catcode`_=12
\begingroup\lccode`~=`_\lowercase{\endgroup\let~\sb}
\mathcode`_="8000

\begin{document}
Under_score but $a_{x}$
\end{document}

Actually the line \mathcode`_="8000 is redundant, but repeating it makes our intentions clear.

We make the character _ "math active", i.e., it behaves like a macro, but only in math mode. The \begingroup\lccode... trick defines this macro to be equivalent to \sb, which in turn is equivalent to the usual _ for introducing a subscript. For it to really be seen as a math active character, we need to give it category code 12, which also makes it printable (outside math mode). However, we need a font that has an underscore in the right position, so we load the T1 output encoding.
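If the \lowercase trick looks cryptic, here is a more verbose sketch that I believe does the same job. It makes _ active inside a group so that the active character can be named directly, gives it the meaning of \sb globally, and then continues exactly as in the preamble above. The layout and comments are mine; only the technique is taken from the original code.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}

\begingroup
  \catcode`_=13     % inside this group only: _ is active, so it can be named
  \global\let_\sb   % give the active _ the meaning of \sb
\endgroup

\catcode`_=12       % in text, _ is now an ordinary printable character
\mathcode`_="8000   % in math mode, _ is treated as if it were active

\begin{document}
Under_score but $a_{x}$
\end{document}
```

The one-line \lowercase version avoids the temporary catcode change, which is why it is the usual idiom.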

Other special characters have to be treated in different ways. For example, the $ symbol can be "neutralized" by saying

\usepackage{fixltx2e}
\catcode`$=12

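in the preamble; the package is necessary because it "robustifies" the \( and \) commands. (With LaTeX releases from 2015 onward these fixes are part of the kernel, so the package is no longer needed.) In-line formulas must now be input with these commands, of course.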
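A minimal sketch of how a document might look with the $ neutralized, assuming a LaTeX release from 2015 or later (on older releases, load fixltx2e first, as shown above). The T1 encoding is my addition: with the default OT1 encoding some font shapes have a different glyph in the dollar slot.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}  % assumption: T1, so that $ prints as a dollar sign in all shapes
% Load every other package BEFORE this line: package code expects $ to start math.
\catcode`$=12             % $ is now an ordinary printable character

\begin{document}
The ticket costs $5, and \(a + b\) is still an in-line formula.
\[
  a^2 + b^2 = c^2
\]
\end{document}
```

Display formulas written with \[ and \] are not affected by the change.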

For the &, one can say

\def\AM{&}
\catcode`&=12

and use \AM for marking alignment points in tabular environments.
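As a sketch of how this might be used in a complete document (the text and table contents are made up for illustration):

```latex
\documentclass{article}
% Load packages before changing the catcode: their code expects & to be special.
\def\AM{&}      % \AM stores an & that still has its alignment meaning
\catcode`&=12   % from here on, a typed & is an ordinary printable character

\begin{document}
Smith & Jones

\begin{tabular}{ll}
  Name  \AM Role   \\
  Smith \AM Author \\
  Jones \AM Editor \\
\end{tabular}
\end{document}
```

The order matters: \AM has to be defined while & still has its special meaning, because the category code is fixed when the definition is read.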

Also the # character can be neutralized, as long as after saying

\catcode`#=12

one doesn't try defining new commands.
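In practice this means that every macro taking arguments has to be defined before the catcode change. A minimal sketch, where \important is a hypothetical command used only for illustration:

```latex
\documentclass{article}
% Definitions that use # for parameters must come first.
\newcommand{\important}[1]{\textbf{#1}}  % hypothetical helper macro
\catcode`#=12   % from here on, # is an ordinary printable character

\begin{document}
See issue #42, which is \important{still open}.
\end{document}
```

A \newcommand with arguments placed after the \catcode line would fail, since # could no longer mark its parameters.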

However, I don't recommend changing catcodes (except perhaps for the underscore). A "search and replace" at the time of conversion to another format is safer.