
OCR lib for math formulas


I need an open-source OCR library that can recognize complex printed math formulas (for example, formulas that were generated with LaTeX). I would like to get LaTeX-like output (or just some AST-like data).

Is there something like this already? Or are current OCR techniques only able to parse line-oriented text?

(Note that I also posted this question on Metaoptimize because some people there might have additional knowledge.)

The problem was also described by OpenAI as im2latex.


Solution

  • SESHAT is an open-source system written in C++ for recognizing handwritten mathematical expressions. It was developed as part of a PhD thesis at the PRHLT research center at Universitat Politècnica de València.

    An online demo: http://cat.prhlt.upv.es/mer/

    The source: https://github.com/falvaro/seshat

    Given a sample represented as a sequence of strokes, the parser converts it to LaTeX or to other formats such as InkML or MathML.
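
To make "a sample represented as a sequence of strokes" concrete, here is a minimal Python sketch that writes strokes out as plain text. The layout (a `SCG_INK` header line, the stroke count, then each stroke's point count followed by its `x y` pairs) is an assumption based on the sample files shipped in the seshat repository, so check it against those files before relying on it. The function name `strokes_to_scgink` and the example coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]   # one pen position (x, y)
Stroke = List[Point]      # one pen-down ... pen-up trace


def strokes_to_scgink(strokes: List[Stroke]) -> str:
    """Serialize strokes as SCG Ink-style text.

    Assumed layout (not confirmed by the original answer):
    header line, number of strokes, then for each stroke
    its number of points followed by one "x y" line per point.
    """
    lines = ["SCG_INK", str(len(strokes))]
    for stroke in strokes:
        lines.append(str(len(stroke)))
        lines.extend(f"{x} {y}" for x, y in stroke)
    return "\n".join(lines) + "\n"


# Two made-up strokes that together form a rough "x".
example = [[(0, 0), (10, 10)], [(0, 10), (10, 0)]]
print(strokes_to_scgink(example), end="")
```

A file produced this way would then be handed to the seshat binary as its input sample; the repository README lists the actual command-line options. Because the input is pen strokes rather than pixels, a scanned or rendered formula image would first need to be turned into strokes by some other step.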