Microsoft has built an AI model that can distinguish security bugs from normal bugs with 99 percent accuracy. The company plans to release the system as open source on GitHub in the coming months.

Besides distinguishing security bugs from normal bugs almost perfectly, the AI also identifies critical, high-priority security issues in 97 percent of cases. The system was trained on a dataset of 13 million work items and bugs from 47,000 Microsoft developers, stored in Azure DevOps and GitHub repositories. The model first learned to tell security bugs apart from normal bugs. It then learned to label the security bugs by severity: low-impact, important, or critical.

The AI could be used to support human experts. Coralogix estimates that developers create 70 bugs per 1,000 lines of code and that fixing a bug takes thirty times longer than writing a line of code. In the United States alone, 113 billion dollars a year is spent on identifying and fixing product defects.

How does the model work?

Microsoft says that the model is being put into production internally and that it is continuously updated with data approved by security experts, who monitor the number of bugs generated in software development. “Every day, software developers stare at a long list of features and bugs that need to be addressed. Security professionals try to help by using automated tools to prioritize security bugs, but too often engineers waste time on false positives or miss a critical vulnerability that has been misclassified,” said Microsoft Senior Security Program Manager Scott Christiansen and Microsoft Data and Applied Scientist Mayana Pereira in a blog post. “We discovered that by linking machine learning models to security experts, we can significantly improve the identification and classification of security bugs.”

Microsoft’s model uses two techniques to make bug predictions. The first is term frequency-inverse document frequency (TF-IDF), an information retrieval technique that weights a word by how often it appears in a document, offset by how common that word is across the whole collection of titles. A word that appears in nearly every title therefore counts for little, while a rare word counts for more. The second is a logistic regression model, which uses a logistic function to model the probability of a particular class or event.
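
To make the two techniques concrete, here is a minimal Python sketch that turns bug titles into TF-IDF vectors and fits a logistic regression on them. It is an illustration only, not Microsoft's implementation: the bug titles, the labels, the function names, the smoothed IDF formula, and the training settings (500 passes, learning rate 0.5) are all assumptions chosen for the example.

```python
import math
from collections import Counter

# Hypothetical bug titles, invented for illustration (1 = security, 0 = normal).
TITLES = [
    ("sql injection in login form", 1),
    ("xss vulnerability in comment field", 1),
    ("buffer overflow in image parser", 1),
    ("password stored in plain text", 1),
    ("button misaligned on settings page", 0),
    ("typo in welcome message", 0),
    ("slow loading of dashboard page", 0),
    ("crash when opening empty file", 0),
]

def tokenize(text):
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency: rarer words get larger values."""
    n = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    return {word: math.log((1 + n) / (1 + count)) + 1.0
            for word, count in doc_freq.items()}

def tfidf(doc, idf):
    """Map a tokenized title to a unit-length {word: weight} vector.
    Words that were not seen during fitting are ignored."""
    counts = Counter(word for word in doc if word in idf)
    vec = {word: tf * idf[word] for word, tf in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {word: v / norm for word, v in vec.items()}

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(vec, weights, bias):
    """Estimated probability that a TF-IDF vector belongs to class 1."""
    return sigmoid(bias + sum(weights.get(word, 0.0) * v
                              for word, v in vec.items()))

def train(vectors, labels, epochs=500, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, label in zip(vectors, labels):
            error = score(vec, weights, bias) - label
            for word, v in vec.items():
                weights[word] = weights.get(word, 0.0) - lr * error * v
            bias -= lr * error
    return weights, bias

docs = [tokenize(title) for title, _ in TITLES]
labels = [label for _, label in TITLES]
idf = fit_idf(docs)
weights, bias = train([tfidf(doc, idf) for doc in docs], labels)

def predict(title):
    """Probability that a new bug title describes a security bug."""
    return score(tfidf(tokenize(title), idf), weights, bias)

for title in ("sql injection in search form", "typo on settings page"):
    print(f"{title!r}: P(security) = {predict(title):.2f}")
```

In this sketch, a title made up of words seen only in security examples scores above 0.5 and one made up of words from ordinary bug reports scores below it. A second model of the same kind, trained only on the security bugs, could then assign the low-impact, important, and critical labels described above, although the article does not say how Microsoft implements that step.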