
The new service delivers flexible capacity and scalable deployment models

This week Databricks announced the general availability of Databricks Model Serving, a serverless real-time inference service that deploys machine learning (ML) models natively within the Databricks Lakehouse Platform.

Model Serving removes the burden of building and maintaining infrastructure for intelligent applications. The service is exposed through a REST (Representational State Transfer) application programming interface.
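As a rough illustration, scoring a deployed model comes down to a single authenticated POST request. The workspace URL, endpoint name, token, and input schema below are placeholders; the exact payload shape must match the signature of the model being served.

```python
import requests

# Placeholder values: substitute your own workspace URL, endpoint
# name, and a Databricks personal access token.
WORKSPACE_URL = "https://<your-workspace>.cloud.databricks.com"
ENDPOINT_NAME = "recommender"  # hypothetical endpoint name
TOKEN = "<personal-access-token>"

# Score two records against the deployed model. The payload shape
# ("dataframe_records" here) depends on the model's input signature.
response = requests.post(
    f"{WORKSPACE_URL}/serving-endpoints/{ENDPOINT_NAME}/invocations",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"dataframe_records": [
        {"user_id": 17, "item_id": 42},
        {"user_id": 17, "item_id": 99},
    ]},
)
response.raise_for_status()
print(response.json())  # model predictions
```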

“Now, organizations can leverage the Databricks Lakehouse Platform to integrate real-time machine learning systems across their business, from personalized recommendations to customer service chatbots, without the need to configure and manage the underlying infrastructure,” Databricks claims.

Flexible capacity on a unified platform

Because the deployment is serverless, Databricks’ infrastructure expands and contracts with the demands of the machine learning model, which the company says provides highly flexible capacity.
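That scaling behavior is configured per served model rather than through user-managed clusters. A minimal sketch of the kind of settings involved follows; the model name and version are placeholders, and the field names reflect the endpoint configuration API as documented at launch.

```python
# Hypothetical served-model configuration: Databricks sizes and scales
# the underlying compute from settings like these, with no clusters
# for the user to provision or manage.
served_model = {
    "model_name": "churn_classifier",  # placeholder registry name
    "model_version": "3",              # placeholder version
    "workload_size": "Small",          # concurrency tier
    "scale_to_zero_enabled": True,     # scale down fully when idle
}
```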

Databricks bills Model Serving as the first production-grade model serving solution built on a unified data and AI platform, based on the company’s lakehouse data warehouse/data lake hybrid. The service integrates with other lakehouse services, including Databricks Feature Store for automated online lookups, MLflow Model Registry for model deployment, Unity Catalog for unified governance, and the platform’s quality and diagnostics tools.

Deploy multiple models with a single endpoint

The company is also introducing serving endpoints, which decouple the model registry from the scoring URI (uniform resource identifier). This lets developers deploy multiple models behind a single endpoint and distribute traffic among them as needed, as in the sketch below.
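The sketch assumes the serving-endpoints REST API; the endpoint name, model versions, and the 80/20 traffic split are illustrative values, not recommendations from Databricks.

```python
import requests

WORKSPACE_URL = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"

# One endpoint fronting two versions of the same registered model,
# with request traffic split 80/20 between them.
endpoint_config = {
    "name": "churn-endpoint",
    "config": {
        "served_models": [
            {"name": "current", "model_name": "churn_classifier",
             "model_version": "3", "workload_size": "Small",
             "scale_to_zero_enabled": True},
            {"name": "challenger", "model_name": "churn_classifier",
             "model_version": "4", "workload_size": "Small",
             "scale_to_zero_enabled": True},
        ],
        "traffic_config": {
            "routes": [
                {"served_model_name": "current", "traffic_percentage": 80},
                {"served_model_name": "challenger", "traffic_percentage": 20},
            ]
        },
    },
}

# Create the endpoint; both models sit behind one scoring URI.
response = requests.post(
    f"{WORKSPACE_URL}/api/2.0/serving-endpoints",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=endpoint_config,
)
response.raise_for_status()
```

A split like this is how a team might canary a new model version on a fraction of live traffic before promoting it, without changing the URI that client applications call.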

“Databricks Model Serving accelerates data science teams’ path to production by simplifying deployments, reducing overhead and delivering a fully integrated experience directly within the Databricks Lakehouse,” said Patrick Wendell, Co-Founder and VP of Engineering at Databricks.

“This offering will let customers deploy far more models, with lower time to production, while also lowering the total cost of ownership and the burden of managing complex infrastructure.”