Databricks has acquired Arcion for €94 million ($100 million). With the deal, Databricks adds data integration and replication functionality to its data lakehouse platform.
The acquisition of Arcion is a strategic move for Databricks, which had previously invested in the startup. Until now, Databricks lacked technology of this kind for data that customers generate themselves, so clients often had to rely on products from other providers.
The Databricks platform does offer connections to data ingestion technology, such as Microsoft Azure Data Factory and Fivetran, to move data directly from the source. There is also Auto Loader, which can ingest data from cloud storage on AWS, Google Cloud and Microsoft Azure.
Benefits for users
Customers who develop and, above all, train AI and ML models stand to benefit most from Arcion. This is because the Arcion platform pulls in data in real time from various databases and applications.
The Arcion platform runs on a data capture engine that pulls in data the moment it is created. In addition, the platform connects to more than 20 different databases and data warehouses.
This will soon enable Databricks to better ingest and replicate streaming data, add its own governance and security functionality and make data actionable for real-time analytics.
Unknown roadmap
Databricks has not yet said anything about the roadmap for further integrating Arcion. For example, it is unknown whether the lakehouse vendor will also integrate Arcion’s no-code interface into its platform. This no-code interface allows users to develop streaming pipelines using natural language, powered by generative AI technology.