Doctors face tremendous expectations daily. The startup Juvoly aims to let them focus more on patients by utilizing speech recognition AI. However, the company ran into a significant hurdle: existing AI models perform poorly when transcribing non-English medical conversations. With NorthC Datacenters’ support and the latest NVIDIA Blackwell hardware, Juvoly managed to overcome this obstacle. How did they achieve this, and what challenges remain?
We spoke with Juvoly co-founder and CEO Thomas Kluiters to learn more. Before diving deeper into the startup’s broader solution, it’s crucial to understand the origins of Juvoly V2, their advanced speech recognition model. Juvoly V2 was specifically created to address the shortcomings of OpenAI’s Whisper when it comes to documenting Dutch conversations in the medical field. While models like GPT-4 and GPT-4o captivated global attention through ChatGPT, Whisper had been considered the benchmark for speech recognition since its launch in September 2022.
Although OpenAI claimed Whisper supported multiple languages, in reality its handling of other languages, particularly when transcribing medical terminology, proved inadequate. In this instance, we’re referring to its poor performance in Dutch, although Kluiters notes that speakers of other non-English languages have drawn similar conclusions. The less well represented a language is in the training data, the worse Whisper’s outputs become.
Juvoly’s starting point
According to Kluiters, this flaw remained largely unnoticed: “Many assume speech recognition is a solved problem, but that’s simply not true.” While Whisper can perform reasonably well in English medical contexts, it falls short elsewhere. “If you want a reliable Dutch-language model, you need developers who understand the language thoroughly,” he explains, highlighting that subtle errors are often overlooked by less specialized models.
Moreover, benchmarks rarely reflect real-world AI performance accurately. Whisper, with roughly 1.5 billion parameters in its largest version, tends to be overly “creative,” often resulting in inaccuracies or “hallucinations,” which are particularly detrimental for precisely recording doctor-patient conversations. Additionally, cloud-based speech recognition services are notoriously costly due to their per-hour pricing.
Juvoly V2 addresses these issues by simplifying the model compared to Whisper and specifically training it on Dutch medical conversations. The result is a solution that’s significantly cheaper, less prone to hallucinations, and about 10% more accurate than Whisper, while being 40 times faster. This speed enables real-time applications. Juvoly V2 also greatly reduces energy consumption, using just 350 Wh per 100 users—compared to the standard 11,000 Wh (11 kWh). A full year’s usage for a Juvoly customer equates to emitting only 200 grams of CO2, similar to driving a gasoline car for two kilometers.
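To put the energy figures in perspective, the short calculation below is a minimal sketch that uses only the two numbers quoted above (350 Wh versus 11,000 Wh per 100 users). The function name `energy_reduction` is purely illustrative, and the article does not state the time period these figures cover, so the result is only a relative comparison.

```python
# Figures quoted in the article, in Wh per 100 users.
JUVOLY_WH_PER_100_USERS = 350
BASELINE_WH_PER_100_USERS = 11_000


def energy_reduction(model_wh: float, baseline_wh: float) -> tuple[float, float]:
    """Return (reduction factor, percentage saved) relative to the baseline."""
    factor = baseline_wh / model_wh
    saved_pct = (1 - model_wh / baseline_wh) * 100
    return factor, saved_pct


factor, saved_pct = energy_reduction(JUVOLY_WH_PER_100_USERS, BASELINE_WH_PER_100_USERS)
print(f"{factor:.1f}x less energy, {saved_pct:.1f}% saved")
# prints "31.4x less energy, 96.8% saved"
```

Per user, that works out to about 3.5 Wh for Juvoly V2 against 110 Wh for the quoted baseline.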
In the next two weeks, Juvoly will launch an upgraded model, Juvoly V3, which promises even better performance, automatic language recognition, and speaker identification.
“Everyone thinks you’re crazy if you build your own speech model,” says Kluiters. “But we did, and it’s paying off significantly now.” Unlike multimodal models like Google’s Gemini, Juvoly ensures data security by keeping patient information within Europe and predominantly within NorthC’s data centers. When using Gemini, sensitive conversations may remain unencrypted in the cloud for up to 55 days for “abuse monitoring” purposes—a significant risk for healthcare providers.
Moving away from the cloud
Juvoly’s ultimate goal is complete independence from cloud services. Its recently acquired NVIDIA B200 system, a significant step in that direction, will soon be inaugurated by Constantijn van Oranje. The “B” stands for Blackwell, NVIDIA’s newest GPU architecture. Juvoly currently owns two B200 nodes and aims for eight by year-end. Each node contains eight GPUs, meaning Juvoly plans to have 64 Blackwell GPUs operational by late 2025. The company also still uses earlier NVIDIA hardware such as the H100 (Hopper) and L40S (Ada Lovelace), which remain effective, though less efficient than their younger GPU cousins.
Interestingly, the GPUs sometimes outpace the CPUs working alongside them, which creates bottlenecks. Kluiters describes scenarios in which a GPU finishes its task in 12 milliseconds while the CPU takes around 60 milliseconds, leaving the expensive NVIDIA chips idle in the meantime.
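The effect of that mismatch can be estimated with a simple model. The sketch below is not Juvoly’s actual pipeline: it assumes, hypothetically, that every GPU task must wait for one CPU task and that CPU workers run fully in parallel. Only the 12 ms and 60 ms figures come from Kluiters; the function names are illustrative.

```python
import math

GPU_MS = 12  # GPU time per task, as quoted by Kluiters
CPU_MS = 60  # CPU time per task, as quoted by Kluiters


def gpu_utilization(gpu_ms: float, cpu_ms: float, cpu_workers: int = 1) -> float:
    """Fraction of time the GPU is busy in a steady-state pipeline.

    Assumption: each GPU task needs one CPU task to finish first, and
    CPU workers run fully in parallel without overhead.
    """
    cpu_ms_per_task = cpu_ms / cpu_workers  # effective CPU time between tasks
    return min(1.0, gpu_ms / cpu_ms_per_task)


def workers_to_saturate(gpu_ms: float, cpu_ms: float) -> int:
    """Smallest number of parallel CPU workers that keeps the GPU busy."""
    return math.ceil(cpu_ms / gpu_ms)


print(gpu_utilization(GPU_MS, CPU_MS))      # 0.2
print(workers_to_saturate(GPU_MS, CPU_MS))  # 5
```

Under these assumptions a single CPU worker keeps the GPU busy only 20% of the time, and it would take five parallel workers to close the gap.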
Juvoly also uses the new GPU capacity for large language models (LLMs), which form the backbone of the real-time summaries in Juvoly’s QuickConsult. Physicians can instantly track the symptoms discussed during a consultation instead of relying solely on transcripts. For post-consultation summaries, Juvoly still uses GPT-4o via Azure, but during conversations, open-source models like Gemma or Llama identify and classify symptoms.
The company’s objective is clear: running all workloads locally within NorthC Datacenters. Though buying hardware independently can seem daunting, Kluiters praises NorthC for making the transition straightforward. Rather than paying thousands monthly for cloud nodes, Juvoly now spends just a few hundred euros per month with dedicated hardware and ample room for growth.
More difficult than it seems
Piet Sjoukes, Director of Sales at NorthC Datacenters, elaborates on facilitating startups like Juvoly. He emphasizes continuity: “Our core service is reliability. Clients can’t afford downtime from cooling or power failures.” He humorously describes their real product as “a good night’s rest”—peace of mind for clients who rely heavily on uptime.
About half of NorthC’s clientele, including Juvoly, are high-tech innovators, often pushing hardware boundaries. “They operate at the cutting edge of technology,” says Sjoukes, highlighting AI’s immense computing demands. Data centers face challenges accommodating the vastly higher power density required today, sometimes exceeding 40 kW per rack, compared to the traditional 3 kW.
NorthC employs modular data center construction to manage these varying demands efficiently, blending traditional and advanced cooling methods like immersion, on-chip cooling, and hot aisle containment. Transparency and proactive client communication are essential for planning growth and infrastructure adjustments effectively.
NorthC also offers an AI-driven EcoSense system for real-time monitoring of power and cooling requirements, optimizing operations continuously. Personal interaction remains critical, with dedicated Customer Success Managers helping clients expand successfully within the NorthC ecosystem.
Conclusion: evolving needs
As healthcare increasingly seeks efficiency amid resource constraints, technology providers like Juvoly become invaluable partners. Juvoly demonstrates how innovative software paired with efficient, powerful hardware can significantly enhance healthcare delivery. While software drives meaningful improvements, it remains dependent on robust, energy-efficient infrastructure. Clear communication and collaborative planning between startups and data centers like NorthC prove essential to achieving sustained growth and innovation.
Ultimately, Juvoly’s approach highlights the value of targeted innovation—efficiently serving specific niches, like Dutch medical professionals, with tailored solutions. This careful integration of technology, infrastructure, and human-centered design promises substantial benefits for both doctors and patients.