Texcount and latexdiff

latexdiff texcount

I am wondering if there is a way to combine latexdiff and texcount, to count the number of words that have changed between two docs?

This is not the same as comparing the word counts before and after; I want to count the words that were changed, added, or deleted.

So if a document had 100 words before and still has 100 words, but 50 of them were changed, then the output should be 50. (I cannot really think of the best way to do this. If 50 words were changed, this would show up as 50 words deleted and 50 words added. Should the count be 50 or 100?)

Since latexdiff marks up added and deleted words, perhaps one could simply count what is inside this markup.

Best Answer

I'm not very familiar with latexdiff, but it seems at least some of the annotation of differences is done using \DIFadd{...} and \DIFdel{...} to indicate added and deleted text.

If that is the case, TeXcount can count these by adding macro handling rules for these two macros. One method is to include the following instructions for TeXcount in the document:

%TC:newcounter add Added
%TC:newcounter del Deleted
%TC:macro \DIFadd [add]
%TC:macro \DIFdel [del]

What these TeX comments do is provide instructions to TeXcount (%TC:...) which define two new counters, and then define macro handling rules for \DIFadd and \DIFdel which each take one argument to be counted using these new counters.
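To make this concrete, here is a minimal hand-written sketch of what a difference file with these rules might look like. It is not real latexdiff output: the file name diff.tex, the sample sentence, and the two \providecommand definitions are assumptions made for illustration, standing in for the preamble that latexdiff generates itself. The %TC lines are placed after \begin{document} here as a cautious choice, so that the rules are in place before any body text is read.

```latex
% diff.tex (hypothetical name): hand-written stand-in for a latexdiff output file
\documentclass{article}
% Simplified stand-ins for the markup macros that latexdiff defines in its own preamble
\providecommand{\DIFadd}[1]{\textbf{#1}}
\providecommand{\DIFdel}[1]{\textit{#1}}
\begin{document}
% TeXcount instructions: two new counters, and one rule per markup macro
%TC:newcounter add Added
%TC:newcounter del Deleted
%TC:macro \DIFadd [add]
%TC:macro \DIFdel [del]
The quick \DIFdel{brown}\DIFadd{red} fox jumps over the \DIFadd{very} lazy dog.
\end{document}
```

Running texcount on such a file should then report the words inside \DIFadd and \DIFdel (two added and one deleted in this sample) under the new counters rather than as ordinary words in text.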

These lines need to be present in the difference file, but if you add them to the original tex file they should pass through the diffing unchanged. Another option is to put them in a separate file which you include, but then you need to run TeXcount with the -merge option so that the included file is merged into the document before counting.
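The separate-file variant could be sketched as follows. The file name tcrules.tex and the use of \input are assumptions for illustration; the rules file is assumed to contain nothing but the four %TC lines given earlier, and the difference file is assumed to be a normal latexdiff result that still contains the \input line.

```latex
% Fragment of the difference file (hypothetical name: diff.tex).
% tcrules.tex is a hypothetical file holding only the four %TC lines shown above.
\begin{document}
\input{tcrules} % the rules are only seen by TeXcount if the file is merged in
The quick \DIFdel{brown}\DIFadd{red} fox jumps over the lazy dog.
\end{document}
```

The count would then be run as texcount -merge diff.tex. Without -merge, TeXcount would presumably not read tcrules.tex, so the two counters and macro rules would never be defined.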

TeXcount has an -opt option which could have been used for this, but unfortunately it does not work as intended: e.g., it does not handle the newcounter instruction.

NB: Beware that words counted as added or deleted in this manner are not included in the other word counters. It is possible to add counters together in the summary output when using templates, e.g. to get the total number of words before and after.
