[TeX/LaTeX] Table mapping mathematical and scientific symbols to their meanings

symbols

Scott Pakin's Comprehensive LaTeX Symbol List (henceforth "Comprehensive List") and Detexify are great resources for finding symbols' LaTeX macros and packages.

There is, however, a related resource that I feel is sorely needed: a table, in whatever form, that maps the most common symbols to their meanings.

To elaborate on some issues relevant for a meaningful interpretation of this request:

  • Which symbols should be documented? Only symbols with accepted usage in established fields should appear. Yes, determining the boundaries is difficult, but this doesn't mean that an effort isn't useful.
  • Which symbols from the Comprehensive List are actually used? An answer to this question could be approximated by counting macro usage within published articles. Frequency information could be informative as well.
  • Why is the Comprehensive List so large? Because a lot of symbols in the Comprehensive List seem to have been generated through a process involving creative rage and obsessive completion based on symmetry considerations. I'm not saying these are bad per se, but it would be nice to know whether esoteric symbol #253 from the Comprehensive List is actually used with consistency anywhere in the math literature.
  • What are practical uses of such a table?

    • Such a table will be useful for math symbol font designers.
    • Unicode might be interested in knowing which symbols still need to be encoded.
    • It is bothersome that the Comprehensive List as well as the interfaces of various equation editors inundate their users with a bunch of graphical symbols the majority of which one won't ever need, with some crucial ones being hard to find or even missing. One could prioritize the symbols in consistent, actual use by presenting them first, giving them shorter macro names, etc. Bringing a bit of order into the mess will be a good thing.
    • Presenting symbols in an organized way will make it easier for package writers and users to organize and select symbols in a way that minimizes package conflicts. The incompatibility of MnSymbol with amssymb and amsfonts is notorious.
    • Indexes (one for each field) for symbols will also make life easier for mathematicians and scientists wanting to publish in a particular field.

Are there such tables? Is the community interested in constructing such tables?

Best Answer

the concept of tables (not just one, surely!) mapping math and scientific symbols to their meanings in particular fields is undoubtedly worthy, and in fact, the question has been asked before, but, as far as i am aware, nothing organized has ever been done about it.

in fact, accomplishing such a feat may be next to impossible. or, as some competent mathematicians have informed me, it may be a useless exercise. why? first, because so many symbols are used, often with different meanings, in different areas. also, a mathematician can define his/her own notation, and if there's not already a well established symbol for a concept (which would be known to a mathematician experienced in the area), a new one will often be selected based on its shape relative to that of symbols already used for related concepts, regardless of the new symbol's meaning in other areas.

so, such tables would be of use mostly to newcomers in the field, mainly graduate students, and few established mathematicians have the incentive or the interest to do what for them would probably be a rather menial job. and the students are usually too busy working on their research, which will lead to the formal recognition of a degree, while working on very useful tex-related projects gains nothing more than appreciation. (more than one degree has foundered on such a shoal.)

it may be instructive to consider how some existing symbol collections were compiled. the basic cmsy and cmex fonts provide the symbols that knuth needed for the art of computer programming. that's a basic computer science collection. the additional amssymb collection (msa and msb fonts) was based on what had been used or required for ams publications prepared by earlier means, including the symbols collection provided by the science typographers software augmented with items from the monotype symbols lists -- none of them identified by anything but an access code meaningless except within the context of that composition system. the control sequence names assigned to the "ams symbols" were usually just the names used by proofreaders, who were often not even mathematicians, and certainly not area specialists.

the stix collection, which was the basis for the massive increase of technical symbols in unicode 3 and 4, started with the cm and ms fonts (no name changes), the symbols component of sgml entity sets (iso tr 9573-13), "needed" lists compiled by the stipub organizations (ams, acs, aip, aps, ieee, siam, and elsevier), and some additional contributions from wolfram and design science. but again, no area identification was included.

compiling area-specific lists requires specialist knowledge, i.e., people. let's leave that aside for the moment.

what approach(es) might be considered for compiling lists by frequency? only one comes immediately to mind: from a corpus of (la)tex publications, count the occurrences of all control sequences, ignoring those that are clearly not symbols (\chapter, \section, \begin, \end, etc.).
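as a rough illustration of that counting idea, here is a minimal python sketch. the stop list `NON_SYMBOLS`, the function name `count_control_sequences`, and the two-line sample corpus are all made up for the example. it does not strip comments or verbatim material, and it will misread a `\\` line break that is followed directly by letters, so treat it as a starting point only.

```python
import re
from collections import Counter

# Control sequences that are clearly structural rather than symbols.
# This stop list is an illustrative assumption, not a complete inventory.
NON_SYMBOLS = {"chapter", "section", "begin", "end", "label", "ref", "cite"}

# A control word is a backslash followed by one or more letters.
CONTROL_WORD = re.compile(r"\\([A-Za-z]+)")

def count_control_sequences(sources):
    """Count control words across an iterable of (La)TeX source strings."""
    counts = Counter()
    for text in sources:
        for name in CONTROL_WORD.findall(text):
            if name not in NON_SYMBOLS:
                counts[name] += 1
    return counts

# Hypothetical two-document corpus, for demonstration only.
corpus = [
    r"\section{Intro} Let $\alpha \leq \beta$ and $\alpha \in A$.",
    r"\begin{equation} \alpha \otimes \beta \end{equation}",
]
counts = count_control_sequences(corpus)
for name, n in counts.most_common():
    print(name, n)
```

in a real corpus the stop list would have to be far longer, and the names that survive would still include non-symbol macros (font switches, spacing commands), so some manual filtering remains.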

but there are problems. authors often define their own macros for either individual symbols or preformed strings of symbols, so the uses of the symbols themselves in the body can't easily be counted. even worse, many authors, over time, compile great collections of macros that they've used before and might use again, and simply dump those into either the preamble or a separate .sty file without any "weeding", so it's not easy to tell whether a particular macro (and thus symbol) is actually used in a job.
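the "weeding" problem can at least be detected mechanically in simple cases. the sketch below is a rough approximation: it lists macros that are defined in the preamble but never appear in the body. the function name `unused_macros` and the sample file are hypothetical, only `\newcommand`, `\renewcommand` and `\def` are recognized, and a macro used only inside another macro's definition will be wrongly reported as unused.

```python
import re

# Matches \newcommand{\name}, \newcommand\name, \renewcommand..., and \def\name.
DEFINITION = re.compile(
    r"\\(?:re)?newcommand\*?\s*\{?\s*\\([A-Za-z]+)|\\def\s*\\([A-Za-z]+)"
)
CONTROL_WORD = re.compile(r"\\([A-Za-z]+)")

def unused_macros(source):
    """Return names defined before \\begin{document} but absent from the body."""
    preamble, _, body = source.partition(r"\begin{document}")
    # Each match fills exactly one of the two groups; keep the non-empty one.
    defined = {a or b for a, b in DEFINITION.findall(preamble)}
    used = set(CONTROL_WORD.findall(body))
    return defined - used

# Hypothetical article with one macro that was dumped in but never used.
sample = "\n".join([
    r"\documentclass{article}",
    r"\newcommand{\tens}{\otimes}",
    r"\newcommand{\R}{\mathbb{R}}",
    r"\def\eps{\varepsilon}",
    r"\begin{document}",
    r"$x \in \R$, $a \tens b$",
    r"\end{document}",
])
leftover = unused_macros(sample)
print(sorted(leftover))
```

definitions pulled in from a separate .sty file would have to be read and concatenated into the preamble before this check means anything.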

it is possible to filter for a few specific symbols by macro name; i've done it to document usage as required for acceptance into unicode, but it's non-trivial. what's necessary is an automated procedure.

here are steps that might be applied to article files:

  • expand all macros in the body, so that the definitions aren't needed any more;
  • remove all definitions from the job;
  • run tex on the result to make sure nothing is lost;
  • confirm that the output is the same as the original.
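the first two steps can be sketched for the easiest case only. the python below handles parameterless `\newcommand` definitions whose bodies contain no braces, using regular expressions rather than real tex expansion. the names `expand_simple_macros` and `SIMPLE_DEF` and the three-line sample are assumptions for illustration. macros with arguments, `\def`, and nested braces are left untouched, and spacing after a substituted name can differ from what tex would produce, which is exactly why the last two steps (re-running tex and comparing output) are still needed.

```python
import re

# Parameterless \newcommand{\name}{body} where body contains no braces.
SIMPLE_DEF = re.compile(r"\\newcommand\{\\([A-Za-z]+)\}\{([^{}]*)\}\n?")

def expand_simple_macros(source, max_passes=10):
    """Inline simple macros in the text and drop their definitions."""
    definitions = dict(SIMPLE_DEF.findall(source))
    text = SIMPLE_DEF.sub("", source)
    # Repeat so that macros defined in terms of other macros get resolved;
    # the pass limit guards against circular definitions.
    for _ in range(max_passes):
        previous = text
        for name, body in definitions.items():
            # The lookahead keeps \tens from matching inside \tensor.
            pattern = r"\\" + name + r"(?![A-Za-z])"
            text = re.sub(pattern, lambda m, b=body: b, text)
        if text == previous:
            break
    return text

# Hypothetical input: one macro defined via another.
source = "\n".join([
    r"\newcommand{\tens}{\otimes}",
    r"\newcommand{\prodt}{\tens}",
    r"$a \tens b \prodt c$",
])
expanded = expand_simple_macros(source)
print(expanded)
```

after this pass the body contains only the underlying symbol macros, so a plain frequency count over control sequences becomes meaningful for the cases the sketch covers.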

undertaking such a task would need a good programmer and a really dedicated tester and/or support from some organization (probably not publishers, other than for access to files, since they're already hard pressed).

now, how to identify the areas in which these symbols are used? even if the area of an article is well defined (say by a subject classification as defined for mathscinet), it's not clear that it's a good example of symbol usage, so any automatic compilation of symbols would still require manual checking.

i think the only reasonable prospect for creating area-specific lists would be action by knowledgeable humans. any volunteers?
