[TeX/LaTeX] How to use Mathpix (a LaTeX OCR tool) to recognize LaTeX from images

ocr python

I recently heard about Mathpix, a tool that recognizes formulas in images and generates the corresponding LaTeX code. I have some printed handouts from a teacher of mine from 10 years ago, written in Chinese and containing many math formulas. I don't have the original digital files, only the paper copies on my shelves. I want to digitize them, that is, get the Chinese text and the math formulas as LaTeX code so that I can reproduce and reprint them. Doing this by hand is a lot of work, so I am looking for a smarter way. I think Mathpix could help a lot, but I have two main questions about it:

  • Suppose I have a picture like this, with a lot of inline math (just a demo, not the actual document I have): [image]

    Can I get a result that contains both the English words and the inline math as LaTeX? (I mean
    the resulting string "Suppose $A$ is bounded subset of $\Bbb R^n$. If ...".) It seems I would need a plain-text OCR tool and Mathpix working together. How can I achieve this?

  • If I have a large batch of images to process, I guess I need to write a Python program using the Mathpix API. But the sample code provided there does not work with my Python 3. I'm not good at Python; how should I modify it? Or is there another clever way to do this? (Maybe I should ask this on another board, but I think fewer people there would know LaTeX.)

Best Answer

If you only have the hard copy, you could also try the Mathpix Android or iOS app: take a picture of a document and the app will render the LaTeX, which you can then export. Try it out and see whether that works better for you!
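For the batch case raised in the question, a script along the following lines might be a starting point. It is only a sketch under several assumptions that should be checked against the current Mathpix API documentation: that the endpoint is `https://api.mathpix.com/v3/text`, that credentials go in `app_id` and `app_key` headers, that the image is sent as a base64 data URI in a `src` field, and that the response carries a `text` field with inline math delimited by `\( ... \)`. The helper names (`build_payload`, `to_dollar_math`, `ocr_image`, `ocr_folder`) are made up for illustration, and how well Chinese text is recognized is not something this sketch can promise.

```python
import base64
import json
import re
import urllib.request
from pathlib import Path

# Assumed endpoint; verify against the current Mathpix API documentation.
MATHPIX_URL = "https://api.mathpix.com/v3/text"


def build_payload(image_bytes, mime="image/jpeg"):
    """Wrap raw image bytes in a base64 data URI (assumed request shape)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"src": f"data:{mime};base64,{encoded}", "formats": ["text"]}


def to_dollar_math(text):
    r"""Rewrite \( ... \) inline delimiters as $...$, trimming inner spaces."""
    return re.sub(
        r"\\\((.+?)\\\)",
        lambda m: "$" + m.group(1).strip() + "$",
        text,
        flags=re.S,
    )


def ocr_image(path, app_id, app_key):
    """Send one image to the service and return text with $...$ inline math."""
    payload = build_payload(Path(path).read_bytes())
    request = urllib.request.Request(
        MATHPIX_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "app_id": app_id,    # assumed header name
            "app_key": app_key,  # assumed header name
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # "text" is the assumed name of the field holding the recognized text.
    return to_dollar_math(result.get("text", ""))


def ocr_folder(folder, app_id, app_key):
    """Process every .jpg in a folder, writing a .tex file next to each image."""
    for image in sorted(Path(folder).glob("*.jpg")):
        recognized = ocr_image(image, app_id, app_key)
        image.with_suffix(".tex").write_text(recognized, encoding="utf-8")
```

With real credentials, calling `ocr_folder("scans", "my_app_id", "my_app_key")` would then write one `.tex` file per scanned page. The script uses only the Python 3 standard library, so it avoids the Python 2 idioms that tend to break older sample code.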
