I have ~25,000 PDF files that I want to classify based on the presence of keywords in their text. I know there's a PDF Toolbox that provides MATLAB with an interface for reading PDF text, but the fact that it comes from Sourceforge makes it difficult to obtain (this is for work) and the reliance on java seems to me like it would make the process very slow -especially for searching so many files. Is there a simpler, faster way to parse these documents if all I want to do is basically strfind on the text to check for keywords?
MATLAB: Quickly Search Strings inside PDF files
optimizationpdfsearchstrfindText Analytics Toolboxwile e. coyote
Best Answer