Is there a way to make a list of all the words that are used in a LaTeX document? Alternatively, if someone knows another way to do it, that would also be helpful, e.g. using Python, a website, or something else.
Here is an example of what I would like:
\documentclass{article}
\begin{document}
I have a dog and a cat.
The dog and the cat are named Bob and John.
\end{document} % Should maybe be after the list
list:
I
have
a
dog
and
cat
the
are
named
bob
john
The order of the words in the list does not matter.
And thank you if you can help.
Best Answer
For some definition of "word" and "being used", you can extract the text from the PDF and process it into a list.
will produce
file1.txt
You can process this with standard Linux utilities (these are also available on Windows if needed; I am actually using the Cygwin versions on Windows):
Then
Produces the list:
At each step, the long command pipeline does the following:
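For readers who would rather use Python, as the question suggests, here is a minimal sketch of the same idea: take text that has already been extracted from the PDF and reduce it to a list of distinct words. It is not the pipeline from this answer. The function name `word_list` is hypothetical, and it assumes that a "word" is a run of ASCII letters (optionally with internal apostrophes) and that case should be ignored, so accented words would need a wider pattern.

```python
import re


def word_list(text):
    """Return the distinct words in text, lowercased and sorted."""
    # Assumption: a "word" is a run of ASCII letters, optionally with
    # internal apostrophes (e.g. "don't"); digits and punctuation are dropped.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    # Lowercase so "The" and "the" count once; a set removes duplicates.
    return sorted({w.lower() for w in words})


# The body text of the example document from the question.
sample = "I have a dog and a cat.\nThe dog and the cat are named Bob and John."
for word in word_list(sample):
    print(word)
```

To run it on a real document, read the extracted text file instead of the sample string, for instance `word_list(open("file1.txt", encoding="utf-8").read())`.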