Databricks wants to set standards for unifying machine learning and AI

Databricks, the company behind Apache Spark, aims to unite data, engineering and people. It wants to do this by defining standards for various processes, including distributed machine learning training, implementation and deployment. This emerges from a ZDNet interview with chief technologist Matei Zaharia.

Databricks wants to do much of this work with its own creation, MLflow. This is a toolkit intended to standardise the process of developing machine learning applications and moving them to production. According to Zaharia, however, everything starts with data engineering.

“In about 80 percent of the use cases, people's ultimate goal is to use data science or machine learning. But to do this, you need a pipeline that can reliably collect data over a longer period of time. Both are important, but you need data engineering to do the rest. We focus on users with large volumes, which is more challenging. If you use Spark for distributed processing, you have a lot of data.”

However, this often also means that the data comes from various sources. Spark and Delta, Databricks' cloud platform built on Spark, already support reading from and writing to a large number of data sources. But Databricks now wants to go one step further by using MLflow to unify different machine learning frameworks, from the lab to production.

It is also building a standard framework for data and execution through Project Hydrogen. This means that data and execution are unified, that data can be exchanged between different ML frameworks, and that the training and inference processes are standardised.

MLflow

The goal of MLflow is to support tracking experiments, sharing and reusing projects, and developing production models. Not only will it be possible to deploy ML models on Spark and Delta, but MLflow can also export them as REST services that can run on any platform, including Kubernetes. Cloud environments are also supported. For now, these are AWS SageMaker and Azure ML.
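To make the idea of tracking experiments more concrete, the sketch below records the parameters and metrics of each training run so that runs can be compared afterwards. It is a minimal illustration in plain Python and does not use the MLflow API. The names `Run`, `ExperimentLog`, `train` and `demo` are hypothetical, and the "training" step is a made-up formula standing in for a real model.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field
from pathlib import Path
from tempfile import TemporaryDirectory


@dataclass
class Run:
    """One training attempt: its hyperparameters and the metrics it produced."""
    params: dict
    metrics: dict = field(default_factory=dict)
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    started_at: float = field(default_factory=time.time)


class ExperimentLog:
    """Hypothetical file-based tracker that stores one JSON file per run."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, run):
        # Persist the run so it can be inspected or compared later.
        path = self.root / f"{run.run_id}.json"
        path.write_text(json.dumps(asdict(run)))
        return path

    def load_all(self):
        # Rebuild Run objects from every stored JSON file.
        return [Run(**json.loads(p.read_text()))
                for p in sorted(self.root.glob("*.json"))]

    def best(self, metric):
        # Return the stored run with the lowest value for the given metric.
        return min(self.load_all(), key=lambda r: r.metrics[metric])


def train(learning_rate):
    """Stand-in for real training: an invented loss that is lowest at 0.1."""
    return (learning_rate - 0.1) ** 2


def demo():
    """Log three runs with different learning rates and return the best params."""
    with TemporaryDirectory() as tmp:
        log = ExperimentLog(tmp)
        for lr in (0.01, 0.1, 1.0):
            run = Run(params={"learning_rate": lr})
            run.metrics["loss"] = train(lr)
            log.save(run)
        return log.best("loss").params


if __name__ == "__main__":
    print(demo())
```

A real tracking tool adds much more on top of this, such as storing model artifacts and offering a UI for comparing runs, but the core record of "which parameters gave which result" is the same.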

This news article was automatically translated from Dutch to give Techzine.eu a head start. All news articles after September 1, 2019 are written in native English and NOT translated. All our background stories are written in native English as well. For more information read our launch article.
