[Tex/LaTex] LaTeX word count for dummies!

word count

I think this question has been asked here many times, and yes, I have googled it. I've found a lot of different answers. Most say "use another tool to count them", but some people have actually scripted something to use directly in LaTeX. Sadly I am really stupid, so I don't know how to use those tools. According to my professor I should count only the title and the text under the title: things like introducing myself, footnotes and references shouldn't be counted!

Here is my source code so far:

\documentclass[12pt, a4paper]{article}
\usepackage[utf8]{inputenc}
\usepackage[left=2cm,right=5cm,top=3cm,bottom=2cm]{geometry}

\begin{document}

counted words: XXXXX

\end{document}

One solution I have found is from phg, answered May 6 '12 at 13:15. He said something about using ConTeXt:

\setwordthreshold{3} %%% min chars in a row to count as word
\startwordcount      %%% start callback
\input knuth\par     %%% counted
\currentwordcount    %%% => 94 with threshold == 3
\input knuth         %%% counted
\stopwordcount       %%% deregister callback
\input knuth         %%% not counted
\dumpwordcount       %%% => 188

Fran, answered Jun 4 '13 at 1:01, said something about texcount.

% CAUTION !!!
% 1) Need --enable-write18 or --shell-escape 
% 2) This file MUST be saved 
%    as "borra.tex" before the compilation
%    in your working directory
% 3) This code will write wordcount.tex
%    and charcount.tex in /tmp of your disk.
%    (Windows users must change this path)
% 4) Do not compile if you are unsure
%    of what you are doing.

\documentclass{article}
\usepackage{moreverb} % for verbatim output

% Count of words

\immediate\write18{texcount -inc -incbib 
-sum borra.tex > /tmp/wordcount.tex}
\newcommand\wordcount{
\verbatiminput{/tmp/wordcount.tex}}

% Count of characters

\immediate\write18{texcount -char -freq
 borra.tex > /tmp/charcount.tex}
\newcommand\charcount{
\verbatiminput{/tmp/charcount.tex}}


\begin{document}


\section{Section: text example with a float}

Words and characters of this example file are 
automatically counted from the source file 
when compiled (therefore generated text as 
\textbackslash{}lipsum[1-10] is {\bfseries not} 
counted). The results are shown at the end 
of the compiled version.
Counts are made in headers, caption floats 
and normal text for the whole file. Subcounts 
for structured parts (sections, subsections, 
etc.) are also made. Number of headers, 
floats and math chunks are also counted. 

\begin{figure}[h]
\centering
\framebox{This is only an example float} 
\caption{This is an example caption}
\end{figure}

\subsection{Subsection: Little text with math chunks}

In line math: $\pi +2 = 2+\pi$ \\   
Display math: \[\pi +2 = 2+\pi\] 

%TC:ignore  
\dotfill End of the example \dotfill 

\subsubsection*{Counts of words} 
\wordcount

%TC:endignore   

\end{document}
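For anyone who would rather not enable shell escape, the same texcount call can be made from outside LaTeX. The following is only a minimal sketch: it assumes texcount is installed and on the PATH, the function names `texcount_command` and `run_texcount` are made up for illustration, and the only flags used are the ones that already appear in the example above (`-inc`, `-incbib`, `-sum`). Since the references should not be counted here, `-incbib` is off by default.

```python
import shutil
import subprocess
import sys


def texcount_command(tex_file, include_bib=False):
    """Build the texcount call from the example above as an argument list."""
    cmd = ["texcount", "-inc"]
    if include_bib:
        cmd.append("-incbib")  # also count the bibliography
    cmd += ["-sum", tex_file]
    return cmd


def run_texcount(tex_file, include_bib=False):
    """Run texcount (assumed to be on PATH) and return its plain-text report."""
    if shutil.which("texcount") is None:
        raise RuntimeError("texcount was not found on PATH")
    result = subprocess.run(
        texcount_command(tex_file, include_bib),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python run_texcount.py borra.tex
    print(run_texcount(sys.argv[1]))
```

The report is the same text that the LaTeX example writes to /tmp/wordcount.tex; the number can then be typed into the document by hand.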

Last but not least, Loop Space, answered Jul 29 '10 at 8:31, wrote a Perl script:

#!/usr/bin/perl -w

@ARGV and $ARGV[0] =~ /^-+h(elp)?$/ && die "Usage:\t$0 files\n\t$0 < files\n\t$0\n";

my $count = 0;
my $first = "";
my $tex = 0;

while ($first =~ /^\s*$/) {
    $first = <>;
}

if ($first =~ /^\\(input|section|setlength|documentstyle|chapter|documentclass|relax|contentsline|indexentry|begin|glossaryentry)/) {
    $tex = sub { $r = $_[0];
                 $m = $_[1];
                 $r =~ s/\\(emph|textbf|textit|texttt|em)\{//g;
                 $r =~ s/\\(sub)*section\*?\{[^\}]*\}//;
                 $r =~ s/\\title\{[^\}]*\}//;
                 $r =~ s/\\\(.*?\\\)/maths/g;
                 $r =~ s/\\\(.*?$/maths/;
                 $r =~ s/^.*?\\\)/maths/;
                 $r =~ s/\\\[.*?\\\]/maths/g;
                 $r =~ s/.*?\\\]// and $m = 0;
                 $m and $r = "";
                 $r =~ s/\\\[.*?$// and $m = 1;
                 $r =~ s/\\\S*//g;
                 $r =~ s/%.*//;
                 return ($r,$m) };
} else {
    $tex = sub { return ($_[0],0) };
    @split = split(" ", $first);
    $count += $#split + 1;
}

while ($s = <>) {
    ($t,$n) = &$tex($s,$n);
    @split = split(" ", $t);
    $count += $#split + 1;
}

print "Number of words: $count\n";

Or maybe it is possible to use a word count tool, programmed in Python? (I know a little, a very little, Python myself).
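To make the Python idea concrete, here is a rough sketch of what such a tool might look like. It is not one of the answers quoted above, just an illustration in the same spirit as the Perl script: it works on the source rather than the PDF, counts the `\title{...}` argument plus the body between `\begin{document}` and `\end{document}`, and skips comments, footnotes, the bibliography and math. The names `strip_braced`, `count_words`, `DROPPED` and `min_chars` are my own assumptions, and several cases are not handled (nested braces in the title, `\cite[...]{...}`, escaped `\$` or `\{`, generated text such as `\lipsum`).

```python
import re
import sys

# Commands whose whole argument is dropped from the body (an assumption; extend as needed).
DROPPED = ("title", "author", "date", "footnote", "cite", "ref", "label")


def strip_braced(text, command):
    """Remove every \\command{...} together with its brace-balanced argument."""
    out, i, start = [], 0, "\\" + command + "{"
    while True:
        j = text.find(start, i)
        if j == -1:
            out.append(text[i:])
            return "".join(out)
        out.append(text[i:j])
        depth, k = 1, j + len(start)
        while k < len(text) and depth:  # walk to the matching closing brace
            depth += {"{": 1, "}": -1}.get(text[k], 0)
            k += 1
        i = k


def count_words(source, min_chars=1):
    """Rough count of the words in the title and body of a LaTeX source string."""
    source = re.sub(r"(?<!\\)%.*", "", source)  # drop comments
    title = re.search(r"\\title\{([^{}]*)\}", source)  # simple, unnested title only
    body = re.search(r"\\begin\{document\}(.*?)\\end\{document\}", source, re.S)
    text = body.group(1) if body else source
    for command in DROPPED:
        text = strip_braced(text, command)
    if title:
        text = title.group(1) + " " + text
    text = re.sub(r"\\begin\{thebibliography\}.*?\\end\{thebibliography\}", " ", text, flags=re.S)
    text = re.sub(r"\$[^$]*\$|\\\[.*?\\\]|\\\(.*?\\\)", " ", text, flags=re.S)  # math
    text = re.sub(r"\\(?:begin|end)\{[^{}]*\}(?:\[[^\]]*\])?", " ", text)  # environment markers
    text = re.sub(r"\\[a-zA-Z]+\*?(?:\[[^\]]*\])?", " ", text)  # command names, optional args
    text = re.sub(r"\\.", " ", text)  # control symbols such as \\ or \%
    words = re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text)
    return sum(1 for word in words if len(word) >= min_chars)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python count_words.py essay.tex
    with open(sys.argv[1], encoding="utf-8") as handle:
        print("Number of words:", count_words(handle.read()))
```

The `min_chars` parameter plays a role similar to the threshold in the ConTeXt snippet: with `min_chars=3`, strings shorter than three letters are not counted. The result could then be typed in place of XXXXX by hand.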

I don't know which solution is the best one, and I don't know how to use them. The only thing I know is that the professor wants me to state how many words I used (title + body text only). By words he meant the words that appear in the PDF file, not every LaTeX command.

Hopefully you are able to help someone as stupid as me. If you can complete the source code I started, that would be a great help! Thank you so much in advance! Yours faithfully 😉

Best Answer

I'm not convinced that any software tool can do this. The typical, and traditional, hand method is to print the document, count the number of words on a typical page and multiply by the number of pages of interest (making due allowance for any illustrations or tables). As far as I am aware nobody, except the compulsive/obsessive, actually counts every word in a document of any length.
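The hand method amounts to a single multiplication. As a small sketch (the function name `estimate_word_count` and the numbers in the usage line are purely illustrative, and the allowance for illustrations is assumed to be expressed as an equivalent number of pages):

```python
def estimate_word_count(words_on_typical_page, pages, pages_of_illustrations=0.0):
    """Estimate the total as words per typical page times the text-only page count."""
    if pages_of_illustrations < 0 or pages_of_illustrations > pages:
        raise ValueError("pages_of_illustrations must be between 0 and pages")
    text_pages = pages - pages_of_illustrations  # allowance for figures and tables
    return round(words_on_typical_page * text_pages)


# Illustrative numbers only: 400 words on a typical page, 10 pages,
# of which about 1.5 pages are taken up by figures and tables.
print(estimate_word_count(400, 10, pages_of_illustrations=1.5))  # prints 3400
```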