[Tex/LaTex] How to find characters that LaTeX doesn’t like

unicode

I use exported .bib entries with biber, biblatex and \usepackage[utf8]{inputenc} in the literature reports for my group. This means most of the common awkward characters (à, etc.) are handled automatically. However, there are many that are not, and for those it prints gibberish on the command line.

\u8:�

That is the closest I can get (via pdflatex file.tex >demo.txt), though it prints actual gibberish on the command line, depending on what the symbol is. It also doesn't say anything about where in the .bib file that character is, so I have to use several runs of "kill here and run fully" to guess the entry…

This means that quite often I have to search through my document trying to find the one character that is breaking the build. Often it isn't even a letter; someone has used a non-ASCII hyphen or some such. Is there an easy way to check a file for characters that LaTeX won't accept?

The closest I've found is some mode in Emacs that turned non-ASCII characters red, but I forget how I enabled it, and I still had trouble noticing one slightly red hyphen in a 3000-line file. Are there better tools? Or does someone know how to turn that mode back on?

Best Answer

I had the same problem when preparing a bibliography and managed to solve it with the text editor Sublime Text. Open the .tex (or .bib) file, press Ctrl+F, make sure the regular expression option (the first button) is on, and search for [^\x00-\x7F]. This matches every character outside the ASCII range, and the matches are circled in the editor.
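If you would rather get a list of positions than scan the file by eye, the same "anything outside \x00-\x7F" test can be scripted. The following is only a minimal sketch, not part of the original answer: the function name find_non_ascii and the sample entry are made up for illustration, and it assumes the file is UTF-8 encoded.

```python
import unicodedata


def find_non_ascii(text):
    """Yield (line, column, character, Unicode name) for each non-ASCII character.

    Line and column numbers are 1-based. The function name is hypothetical.
    """
    # Split on "\n" only, so line numbers match what a text editor shows.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:  # same test as the regex [^\x00-\x7F]
                yield line_no, col_no, ch, unicodedata.name(ch, "UNKNOWN")


# Made-up sample entry: an accented letter and a non-ASCII hyphen (U+2010).
sample = "title = {Caf\u00e9},\nauthor = {A\u2010B}\n"

for line_no, col_no, ch, name in find_non_ascii(sample):
    print(f"line {line_no}, col {col_no}: U+{ord(ch):04X} {name}")
```

To check a real file, read it first, for example with open("refs.bib", encoding="utf-8").read() (the file name is a placeholder), and pass the resulting string to find_non_ascii. Printing the Unicode name makes it easier to tell a look-alike hyphen or dash from an ordinary one.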
