Almost all business data is trapped inside documents. PDFs in particular are often hard to search and digitize. With a new OCR tool, Mistral promises better performance in this area than was previously possible.
French AI company Mistral is best known as a European builder of large language models, but it is innovating in other areas as well. With Mistral OCR, it aims to make pages searchable and to digitize them so that their contents become usable as markdown. That output can then be used to train AI on proprietary data, something that has not been realistic so far for organizations with mountains of PDFs and physical paper trails.
Multimodal approach
As with its multimodal LLMs, Mistral is aiming for broad applicability. The OCR API can handle text, tables, mathematical formulas and complex layouts. The model is multilingual and can process documents in different languages and fonts. Handwriting in both English and Arabic, for example, should be convertible to digital text.
Performance and availability
According to Mistral AI’s own benchmarks, the OCR API outperforms comparable solutions from Google Gemini and Microsoft Azure. The model is said to recognize text more accurately and can process up to 2,000 pages per minute on a single node.
The API is available immediately through La Plateforme, Mistral’s developer-focused platform, and will also be integrated into the Le Chat app. For organizations that process sensitive information, Mistral offers the option of running the technology on their own infrastructure. The cost is about $1 per 1,000 pages, with discounts for batch processing.
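To put those figures in perspective, the rough Python sketch below estimates what a large archive would cost and how long it would take at a minimum. It assumes the quoted numbers hold exactly: the claimed peak of 2,000 pages per minute on one node and a flat rate of $1 per 1,000 pages. Batch discounts are not modeled, and the function name `estimate_ocr_job` is purely illustrative rather than part of any Mistral tooling.

```python
# Figures quoted in the article; treated here as exact for a rough estimate.
PAGES_PER_MINUTE = 2000      # claimed peak throughput on a single node
USD_PER_1000_PAGES = 1.0     # approximate list price, batch discounts ignored


def estimate_ocr_job(pages: int) -> tuple[float, float]:
    """Return (cost in USD, best-case processing time in minutes)."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    cost = pages / 1000 * USD_PER_1000_PAGES
    minutes = pages / PAGES_PER_MINUTE
    return cost, minutes


if __name__ == "__main__":
    # Hypothetical archive of one million pages.
    cost, minutes = estimate_ocr_job(1_000_000)
    print(f"${cost:.2f}, about {minutes:.0f} minutes ({minutes / 60:.1f} hours)")
    # prints: $1000.00, about 500 minutes (8.3 hours)
```

Under these assumptions, an archive of one million pages would cost about $1,000 and take a little over eight hours on a single node. Real-world throughput will likely be lower, since the 2,000 pages per minute figure is a stated maximum.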
Applications
Mistral sees several applications for the OCR API, including digitizing scientific research, preserving historical documents and streamlining customer service. Of course, collecting physical records sometimes remains painstaking work, especially when it means searching through dusty archives. But the reading step itself can be the last stumbling block, with even the human eye sometimes unable to decipher the many different kinds of handwriting.
This presents an opportunity for AI both to simplify this final step of digitization and to achieve higher accuracy than even humans can. We will have to wait for the reaction from end users before concluding that Mistral OCR truly represents a breakthrough in this area. The benchmarks, published in Mistral’s own blog post, bode well.