Microsoft says its spellchecker is “most comprehensive” ever


The AI-powered “Speller100” aims to help deliver better Bing Search Engine Results.

Microsoft has announced a new AI system for spellchecking. The company has named it Speller100 because, according to Microsoft, it corrects spelling in more than 100 languages.

A team of Bing Search and AI engineers introduced Speller100 this week in a blog post, in which they touted it as a new level of spellcheck performance. “We believe Speller100 is the most comprehensive spelling correction system ever made in terms of language coverage and accuracy,” they wrote.

The challenge: about 15% of searches have misspellings

“In search we’ve found about 15% of queries submitted by customers have misspellings,” they claim. “When queries are misspelled, we match the wrong set of documents and trigger incorrect answers, which can produce a suboptimal results page for our customers.”

Spelling correction is therefore the very first component in the Bing search stack, they explain: correcting a query to what the user actually meant improves every downstream search component, according to the researchers.
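To illustrate why correction sits at the front of the stack, here is a minimal, purely hypothetical sketch of a search pipeline; the function names and the toy correction table are illustrative assumptions, not Microsoft's actual components:

```python
# Hypothetical sketch: spelling correction as the first stage of a
# search pipeline. None of these names are Microsoft's real APIs.

def correct_spelling(query: str) -> str:
    # Toy lookup table standing in for a real speller model.
    corrections = {"recieve": "receive", "mispelling": "misspelling"}
    return " ".join(corrections.get(word, word) for word in query.split())

def retrieve_documents(query: str) -> list[str]:
    # Placeholder retrieval: a real engine would query an index here.
    return [f"doc matching '{query}'"]

def search(raw_query: str) -> list[str]:
    # Correction runs before retrieval, so every downstream component
    # (ranking, answers, the results page) sees the intended query.
    corrected = correct_spelling(raw_query)
    return retrieve_documents(corrected)

print(search("how to recieve mail"))
```

Because the corrected query is what flows onward, a single fix at this stage avoids matching the wrong documents everywhere downstream, which is the behavior the researchers describe.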

Bing had offered spelling correction in about two dozen languages for quite some time. However, users who issued queries in other languages were left with inferior results or had to correct their queries manually, according to Microsoft.

AI provided the power behind the Speller100

“Our spelling correction technology powers several product experiences across Microsoft,” they explain. “Since it is important to us to provide all customers with access to accurate, state-of-the-art spelling correction, we are improving search so that it is inclusive of more languages from around the world with the help of AI at Scale.”

Indeed, AI plays a major role in Speller100’s performance. They say they expanded its capability by “leveraging recent advances in AI, particularly zero-shot learning combined with carefully designed large-scale pretraining tasks.” They also drew on historical linguistics theories, according to the team.