Mistral launches Voxtral: open-source speech recognition for businesses

Mistral is launching its new Voxtral speech models, designed to serve as an alternative to closed APIs offered by competitors. The open-source models feature advanced speech recognition, native multilingualism, and extensive context processing for production environments.

Until now, companies had to choose between open-source ASR systems with high error rates and expensive proprietary APIs. Mistral aims to bridge this gap with the new Voxtral models, which combine state-of-the-art accuracy with native semantic understanding for less than half the price of comparable solutions.

Advanced speech functionality

The company has released two variants: a 24B model for production environments and a 3B variant for local and edge deployments. Both versions are available under the Apache 2.0 license, which permits commercial use, modification, and redistribution.

The models go beyond transcription. A 32k-token context length lets them handle audio of up to 30 minutes for transcription or up to 40 minutes for understanding tasks. They also offer built-in question-and-answer functionality and can generate structured summaries on the fly.
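To make the length limits concrete, here is a minimal Python sketch of how a recording longer than the stated 30-minute transcription limit might be divided into segments before processing. The helper `plan_segments` and its `overlap_s` parameter are hypothetical names introduced for illustration; the code only computes time offsets and does not call any Mistral API.

```python
from typing import List, Tuple

# Limits quoted in the article, converted from minutes to seconds.
MAX_TRANSCRIPTION_SECONDS = 30 * 60
MAX_UNDERSTANDING_SECONDS = 40 * 60


def plan_segments(
    duration_s: float,
    max_segment_s: float = MAX_TRANSCRIPTION_SECONDS,
    overlap_s: float = 0.0,
) -> List[Tuple[float, float]]:
    """Return (start, end) offsets in seconds covering the whole recording.

    Hypothetical helper: each segment is at most max_segment_s long, and
    consecutive segments share overlap_s seconds so that words cut at a
    boundary appear in both segments.
    """
    if duration_s <= 0:
        return []
    if not 0 <= overlap_s < max_segment_s:
        raise ValueError("overlap_s must be >= 0 and smaller than max_segment_s")

    segments: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration_s:
        end = min(start + max_segment_s, duration_s)
        segments.append((start, end))
        if end >= duration_s:
            break
        # Step back by the overlap so the next segment re-covers the boundary.
        start = end - overlap_s
    return segments
```

Under these assumptions, a 20-minute recording fits in one segment, while a one-hour recording is planned as two 30-minute segments, or three if a 60-second overlap is requested. For understanding tasks, `MAX_UNDERSTANDING_SECONDS` could be passed as `max_segment_s` instead.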

According to Mistral, these capabilities make the Voxtral models well suited to real-world interactions and follow-up actions such as summaries, responses, analysis, and insights. For cost-sensitive use cases, the company points to Voxtral Mini Transcribe.

Multilingual performance

Voxtral automatically recognizes languages and achieves state-of-the-art performance in the widely used languages English, Spanish, French, Portuguese, Hindi, German, Dutch, and Italian. This helps teams serve a global audience with a single system.

In benchmark tests, Voxtral Small consistently outperforms Whisper large-v3 and beats GPT-4o mini Transcribe and Gemini 2.5 Flash in all tasks. In the FLEURS evaluation, it outperforms Whisper in every task and achieves state-of-the-art results in multiple European languages.

The models can also perform function calls directly from speech. This enables the triggering of backend functions, workflows, or API calls based on spoken user intentions, eliminating the need for intermediate processing steps.
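As a rough sketch of what handling such a function call could look like on the application side, the Python example below routes a tool call to a backend function. The shape of the tool call (a `name` plus JSON `arguments`) is an assumption modeled on common function-calling conventions, not Voxtral's documented output format, and `create_ticket`, `check_order`, and `dispatch` are hypothetical names.

```python
import json
from typing import Any, Callable, Dict


# Hypothetical backend functions an application might expose to the model.
def create_ticket(subject: str, priority: str = "normal") -> Dict[str, Any]:
    return {"status": "created", "subject": subject, "priority": priority}


def check_order(order_id: str) -> Dict[str, Any]:
    return {"order_id": order_id, "state": "unknown"}


# Map of callable names to implementations; only these can be triggered.
REGISTRY: Dict[str, Callable[..., Dict[str, Any]]] = {
    "create_ticket": create_ticket,
    "check_order": check_order,
}


def dispatch(tool_call: Dict[str, Any]) -> Dict[str, Any]:
    """Run the backend function named in an (assumed) tool-call payload."""
    name = tool_call["name"]
    if name not in REGISTRY:
        raise KeyError(f"unknown function: {name}")
    args = tool_call.get("arguments", {})
    # Arguments are assumed to arrive either as a dict or as a JSON string.
    if isinstance(args, str):
        args = json.loads(args)
    return REGISTRY[name](**args)


# Example: a spoken request such as "open a high-priority ticket, the
# printer is offline" is assumed to yield a payload like this one.
result = dispatch({
    "name": "create_ticket",
    "arguments": '{"subject": "Printer offline", "priority": "high"}',
})
```

Restricting execution to an explicit registry keeps a misheard or unexpected function name from reaching arbitrary backend code.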
