
A team of researchers has come up with an algorithm that can be used to analyse Google searches and tweets to forecast COVID-19 outbreaks.

The team, led by Harvard scientists Nicole Kogan and Mauricio Santillana, presented the algorithm in a paper submitted last Thursday to the arXiv scientific paper repository. In an interview with the New York Times, Santillana said that the algorithm could conceivably predict outbreaks about 14 days before they occur.

If its results hold up, the research could prove valuable in the fight against the coronavirus. It could help health officials detect a rise in COVID-19 cases earlier and take more effective action to slow the spread of the virus.

Their forecasting approach, which has yet to be peer-reviewed, could be a significant step toward that kind of early warning.

“Such a combined indicator may provide timely information, like a ‘thermostat’ in a heating or cooling system, to guide intermittent activation, intensification, or relaxation of public health interventions,” Santillana, Kogan and their colleagues wrote in the arXiv paper.

How the algorithm works

To make the forecasts, the algorithm analyses five different types of data:

  • Google Trends statistics on the most-searched keywords.
  • Clinicians’ search queries on the UpToDate medical data platform.
  • COVID-19-related tweets.
  • Anonymised smartphone mobility data.
  • Readings from San Francisco-based Kinsa Inc.’s smart thermometers.

According to the researchers, analysing these data sources in aggregate can offer early insight into COVID-19 outbreaks. The researchers tested the algorithm on data from the period leading up to the rise in COVID-19 cases reported in New York in mid-March.

They identified a ‘sharp increase’ in COVID-19-related tweets more than seven days before cases started to rise. A similar ‘sharp uptrend’ appeared in Kinsa thermometer readings and relevant Google searches.
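To make the idea of a combined early-warning signal more concrete, here is a minimal Python sketch. It is not the researchers’ method: the article does not describe their statistical approach in detail, so the sketch swaps in a simple stand-in (standardising each signal against a quiet baseline period, averaging the results, and flagging a sustained jump). The function names, the threshold, the run length and all the numbers are hypothetical and chosen purely for illustration.

```python
from statistics import mean, stdev


def zscores(series, baseline_days):
    """Standardise a series against the mean and spread of its first baseline_days values."""
    base = series[:baseline_days]
    mu, sigma = mean(base), stdev(base)
    if sigma == 0:
        sigma = 1.0  # avoid dividing by zero on a perfectly flat baseline
    return [(x - mu) / sigma for x in series]


def combined_indicator(signals, baseline_days):
    """Average the z-scores of all signals, day by day."""
    scored = [zscores(s, baseline_days) for s in signals.values()]
    return [mean(day) for day in zip(*scored)]


def first_uptrend_day(indicator, threshold=3.0, run_length=2):
    """Return the first day of a run of run_length days above threshold, or None."""
    run = 0
    for day, value in enumerate(indicator):
        run = run + 1 if value > threshold else 0
        if run == run_length:
            return day - run_length + 1
    return None


# Made-up daily counts for illustration only; these are NOT real data.
signals = {
    "tweets":   [10, 12, 11, 9, 10, 11, 10, 25, 40, 60],
    "searches": [50, 52, 49, 51, 50, 48, 50, 70, 95, 120],
    "fevers":   [3, 4, 3, 3, 4, 3, 4, 8, 12, 15],
}

# Treat the first seven days as the "normal" baseline (an assumption).
indicator = combined_indicator(signals, baseline_days=7)
alert_day = first_uptrend_day(indicator)
print("Combined indicator first spikes on day index:", alert_day)
```

Averaging several standardised signals means no single noisy source can trigger an alert on its own, which is one plausible reason to look at such data in aggregate rather than one source at a time.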

The project is still in an early stage

The researchers cautioned that the project is still at an early stage. “These efforts represent an initial exploratory framework, and both continued study of the predictive power of digital indicators, as well as further development of the statistical approach, are needed,” they wrote in the paper.