Databricks launches new tools for machine learning


The American big data company Databricks has introduced a number of tools for building machine learning models. The Automated Machine Learning (AutoML) tools are meant to enable citizen data scientists to build their own machine learning models.

The tools become part of Databricks’ Unified Analytics Platform. They are intended to let users with little or no training complete the complex machine learning process, so that non-experts can also use machine learning algorithms to make predictions about ‘the real world’.

Machine learning for untrained users

Creating machine learning models normally requires a high level of knowledge and skill. Databricks states that AutoML can automate crucial parts of the process, such as hyperparameter tuning, feature engineering, automatic model tracking, reproducibility and deployment. “With the introduction of the ‘low code’ and ‘no-code’ concepts, AutoML is fundamentally changing the way organisations use machine learning and approach data science,” says Adam Conway, vice president of product management at Databricks. “With the right automation, AutoML can drastically reduce the time-to-value for data science teams.”
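
To give a rough idea of what automated hyperparameter tuning with run tracking involves, here is a minimal sketch in plain Python. It is not Databricks' or MLflow's API: the function names, the toy linear model, the synthetic data and the search grid are all assumptions made for illustration. Every combination of settings is trained, scored on held-out data and recorded, and the best run is selected automatically.

```python
import itertools
import random


def make_data(n=200, seed=0):
    """Generate a synthetic dataset y = 3x + 0.5 + noise (illustrative only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [3.0 * x + 0.5 + rng.gauss(0, 0.1) for x in xs]
    return xs, ys


def train(xs, ys, learning_rate, epochs):
    """Fit y = w*x + b with plain batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 2 / n * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = 2 / n * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the fitted line on the given data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(grid):
    """Try every hyperparameter combination and keep a record of each run."""
    xs, ys = make_data()
    split = int(0.8 * len(xs))  # 80% training data, 20% validation data
    train_x, train_y = xs[:split], ys[:split]
    val_x, val_y = xs[split:], ys[split:]

    runs = []  # hypothetical stand-in for a model tracking store
    for lr, epochs in itertools.product(grid["learning_rate"], grid["epochs"]):
        w, b = train(train_x, train_y, lr, epochs)
        runs.append({
            "params": {"learning_rate": lr, "epochs": epochs},
            "val_mse": mse(val_x, val_y, w, b),
        })

    best = min(runs, key=lambda run: run["val_mse"])
    return best, runs


# Hypothetical search grid; a real tool would choose or adapt this itself.
GRID = {"learning_rate": [0.001, 0.01, 0.1], "epochs": [10, 100]}
best_run, all_runs = grid_search(GRID)
print("best parameters:", best_run["params"])
```

Keeping the parameters and score of every run, rather than only the winner, is what makes the result reproducible: any run can be repeated later from its recorded settings.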

“There simply aren’t enough expert, experienced and trained data scientists in the world to do all the work manually at the speed and scale required for modern machine learning,” says James Kobielus, analyst at Wikibon at SiliconAngle. “These AutoML announcements focus on a gap in the market for comprehensive programming tools to help the next generation of citizen data scientists automate more of the development, training and tuning of ML models.”

New features part of MLflow

The new features will be part of Databricks’ MLflow. MLflow is an open-source platform that has been available since last year. The platform is used to package, test and deploy machine learning code across multiple cloud services.

MLflow uses Apache Spark, the core component of Databricks’ Unified Analytics Platform. The platform is used to analyze data, create data pipelines and build labeled datasets for training machine learning models.

This news article was automatically translated from Dutch to give a head start. All news articles after September 1, 2019 are written in native English and NOT translated. All our background stories are written in native English as well. For more information read our launch article.