Visma’s AI team is quietly redefining document processing across Europe. Drawing on nearly a decade of experience, Visma Machine Learning Assets now processes over 18 million documents per month, powering key business processes through highly specialized AI models. What began as an effort to streamline accounting workflows has grown into a far-reaching initiative that blends state-of-the-art AI with real-world business needs.
Like many AI teams in the mid-2010s, Visma’s group initially relied on traditional deep learning methods such as recurrent neural networks (RNNs), similar to the systems that powered Google Translate back in 2015. But around 2020, the Visma team made a change. “We scrapped all of our development plans and have been transformer-only since then,” says Claus Dahl, Director ML Assets at Visma. “We realized transformers were the future of language and document processing, and decided to rebuild our stack from the ground up.”
This shift came before transformer-based systems like ChatGPT hit the mainstream. The team’s first transformer-powered product entered production in 2021, allowing Visma to gain a valuable head start in adopting cutting-edge NLP technologies.
Document extraction at scale
The team’s flagship product is a robust document extraction engine that processes documents in the countries where Visma companies are active. It supports a variety of languages and handles documents such as invoices and receipts. The engine identifies key fields, such as dates, totals, and customer references, and feeds them directly into accounting workflows.
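The core idea of field extraction can be pictured with a deliberately simplified sketch. The snippet below uses regular expressions as a stand-in for Visma’s transformer models, and the field names and patterns are illustrative assumptions, not the actual engine:

```python
import re

# Simplified stand-in for a learned extraction engine: pull key fields
# (date, total, customer reference) out of raw invoice text. The real
# system uses transformer models rather than fixed patterns.
FIELD_PATTERNS = {
    "date": r"\b(\d{4}-\d{2}-\d{2})\b",
    "total": r"Total:\s*([\d.,]+)",
    "customer_ref": r"Ref:\s*(\S+)",
}

def extract_fields(text: str) -> dict:
    """Return the first match for each known field, or None if absent."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        m = re.search(pattern, text)
        fields[name] = m.group(1) if m else None
    return fields

invoice = "Invoice 2024-03-15 Ref: ACME-42 Total: 1,250.00"
print(extract_fields(invoice))
```

The extracted dictionary is what would flow into the downstream accounting workflow; a learned model replaces the patterns but the interface stays the same.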
Complementing this is Auto Suggest, a tool that automatically labels transactions using learned business behavior. The system adapts to individual organizations, suggesting the right account numbers, department codes, and other necessary metadata based on past activity.
What users experience as a seamless interface is actually a federation of around 50 specialized models working together under the hood. Each model is tailored to specific tasks or data structures, and selected dynamically depending on the query type. This modularity ensures optimal performance without compromising the user experience.
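A federation like this can be sketched as a registry that routes each request to the right specialist behind one interface. The query types and model functions below are hypothetical placeholders for the roughly 50 production models:

```python
from typing import Callable

# Registry mapping a query type to a specialized model. In production
# this would hold ~50 models; two toy entries stand in for them here.
MODEL_REGISTRY: dict[str, Callable[[str], str]] = {}

def register(query_type: str):
    """Decorator that adds a model function to the registry."""
    def wrap(fn):
        MODEL_REGISTRY[query_type] = fn
        return fn
    return wrap

@register("invoice")
def invoice_model(doc: str) -> str:
    return f"invoice fields extracted from {len(doc)} chars"

@register("receipt")
def receipt_model(doc: str) -> str:
    return f"receipt fields extracted from {len(doc)} chars"

def route(query_type: str, doc: str) -> str:
    """Select the specialist dynamically based on the query type."""
    model = MODEL_REGISTRY.get(query_type)
    if model is None:
        raise ValueError(f"no specialized model for {query_type!r}")
    return model(doc)
```

The caller only ever sees `route()`, which is what lets the federation change underneath without touching the user experience.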
Multilingual by design
Language diversity has been built into the system from day one. The team has tested their models in about 20 different languages, achieving strong results even in lower-resource contexts. While English remains the most robust due to its abundant training data, Dahl emphasizes that the system handles multilingual scenarios well, especially for straightforward information extraction tasks. “If a document is in Vietnamese, we can still find the date and amount,” he notes. “It becomes more complex when you’re trying to extract something like a project reference. That’s where deeper context and language-specific understanding are needed.”
To support this, the team relies on multilingual foundation models, which allow users to interact with the system in their preferred language, whether they’re uploading documents or querying their contents.
Efficient training for high-impact models
One of the most compelling aspects of Visma’s approach is its training strategy. Instead of chasing massive datasets, the team prioritizes quality over quantity. Each year, they process around 200 million documents from approximately 100 countries. These documents form the basis for a large general model that learns to understand diverse document formats and layouts. Once this foundation is in place, smaller, highly specialized models can be trained using as few as 50 carefully selected examples.
“High-quality data is more valuable than high volumes. We’ve invested in a dedicated team that curates these datasets to ensure accuracy, which means our models can be fine-tuned very efficiently,” Dahl explains. This strategy mirrors the scaling laws used by large language models but tailors them for targeted enterprise applications. It allows the team to iterate quickly and deliver high performance in niche use cases without excessive compute costs.
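The two-stage strategy can be illustrated schematically: a general model absorbs broad statistics from a large corpus, and a specialist starts from that knowledge instead of from scratch, so a few dozen curated examples go a long way. The classes and token-counting "training" below are illustrative assumptions, not Visma's actual pipeline:

```python
# Toy two-stage training: pretrain a general model on many documents,
# then fine-tune a small specialist from it on ~50 curated examples.
class GeneralModel:
    def __init__(self):
        self.weights = {}

    def pretrain(self, corpus):
        # Learn broad token statistics from the large document corpus.
        for doc in corpus:
            for token in doc.split():
                self.weights[token] = self.weights.get(token, 0) + 1

class SpecialistModel:
    def __init__(self, base: GeneralModel):
        # Initialize from the general model instead of from scratch.
        self.weights = dict(base.weights)

    def fine_tune(self, examples):
        # A handful of labeled examples is enough to specialize.
        for doc, label in examples:
            self.weights[label] = self.weights.get(label, 0) + 10

base = GeneralModel()
base.pretrain(["invoice total due", "receipt total paid"])
specialist = SpecialistModel(base)
specialist.fine_tune([("invoice total due", "invoice")] * 50)
```

The point of the sketch is the shape of the pipeline: the expensive pretraining happens once, and each cheap fine-tuning run reuses it.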
Bridging automation and insight
As AI matures, document processing is no longer just about automation. Increasingly, it’s about delivering real-time insight. Visma’s systems are now capable of achieving error rates of 1–3%, close to human-level performance. This accuracy is achieved through a dual-layered quality monitoring approach. One team tracks user performance metrics, while another audits document results in real time. Together, they provide the oversight needed to ensure consistency across thousands of business cases and formats.
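The auditing layer can be sketched as sampling model outputs and comparing them to human-verified values to track an error rate. The sample size, seed, and field values below are illustrative assumptions:

```python
import random

# Minimal sketch of an audit layer: compare a random sample of model
# predictions against human-verified ground truth and report the
# observed error rate.
def audit_error_rate(predictions, ground_truth, sample_size=100, seed=0):
    """Estimate the error rate from a random audit sample."""
    rng = random.Random(seed)
    ids = rng.sample(list(predictions), min(sample_size, len(predictions)))
    errors = sum(predictions[i] != ground_truth[i] for i in ids)
    return errors / len(ids)

preds = {i: "ok" for i in range(1000)}
truth = dict(preds)
for i in range(20):  # inject 2% disagreement for the demo
    truth[i] = "mismatch"

rate = audit_error_rate(preds, truth, sample_size=1000)
print(f"audited error rate: {rate:.1%}")
```

In practice such a metric would be tracked continuously and alarmed on, which is how a 1–3% target stays honest across thousands of document formats.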
“Accuracy is critical, but so is adaptability,” Dahl notes. “Every business is slightly different, so we focus on making the AI flexible enough to learn from that context.” The goal is twofold: to give customers software that can take over manual tasks, and to help businesses extract the meaning of their documents so they can make better-informed decisions faster.
Addressing language and format challenges
Despite their success, challenges remain. Documents in unfamiliar languages or with highly localized formats can complicate extraction. Standard fields, such as dates and amounts, translate well across languages, but nuanced elements, like contract references or project numbers, often require a deeper semantic understanding. This is especially true when the format lacks consistency.
“There’s a big difference between reading a standard invoice and interpreting a free-form contract,” says Dahl. “That’s where specialized training and context awareness become essential.” To overcome these challenges, the team relies on transformer models’ ability to infer meaning from structure and context rather than depending purely on keyword matches or templates.
Toward more autonomous AI systems
One of the emerging trends the Visma team is exploring is extending the autonomy of their AI systems. While most AI operates in short bursts, for example, processing a document or handling a transaction, the goal is to develop systems that can sustain coherent operation over longer periods. This mirrors developments seen in software agents, but comes with its own hurdles. Unlike the public code repositories used to train coding AIs, most business data is confidential.
“There aren’t that many businesses putting all their accounts on the Internet. So you have to find creative ways to train models while respecting privacy,” Dahl notes. Still, the ambition remains to create AI that can reason over time, draw insights across documents, and serve as a true business partner.
Contrary to fears that AI will replace workers, Dahl argues that the opposite is happening in business administration. There’s a shortage of qualified professionals, and AI is helping to close that gap. “I’ve heard of accountants letting clients go because they couldn’t serve them profitably,” he says. “AI allows these firms to handle more clients without compromising quality.”
Towards AI readiness
The conversation around AI in business has also evolved. Risk assessments and legal concerns dominated early discussions. Today, many professionals have first-hand experience with AI through translation tools, content generation apps, or even casual interactions. Businesses now approach AI with practical expectations, evaluating it based on performance, ease of integration, and return on investment. “We’ve moved past the hype,” Dahl reflects. “People are asking: does it work for my business, and how fast can I use it?”
Visma’s AI journey shows how a focused, specialized approach to machine learning can deliver results at scale. By prioritizing efficiency, multilingual support, and legal compliance, the team has built a foundation for intelligent automation. As AI systems become more autonomous, contextual, and integrated, they’re evolving into tools that help businesses make smarter decisions.