Databricks provides universal data format in Delta Lake 3.0

Databricks recently presented its Delta Lake 3.0 platform. A key feature of this platform is the introduction of Universal Format (UniForm).

With the new version of its Delta Lake platform, Databricks aims to provide the foundation for the Linux Foundation's open-source data lakehouse project Delta Lake. An open-source lakehouse architecture should give companies more choice and flexibility; at least, that is the thinking behind the initiative.

UniForm format

In Delta Lake 3.0, Databricks focuses on unifying the three table formats companies may encounter in a data lakehouse: the open-source Delta Lake format itself, Apache Iceberg, and Apache Hudi.

In Delta Lake 3.0, UniForm allows a table written in the Delta Lake format to be read as if it were an Apache Iceberg or Apache Hudi table. This enables a single data ecosystem that can serve all three formats.
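As a sketch of how this might look in practice (the table property follows the Delta Lake UniForm documentation; the table and column names are hypothetical), UniForm is enabled when a Delta table is created:

```sql
-- Hypothetical table; the UniForm property is set at creation time.
CREATE TABLE sales_events (event_id BIGINT, amount DOUBLE)
USING DELTA
TBLPROPERTIES (
  'delta.universalFormat.enabledFormats' = 'iceberg'
);
```

With this property set, Delta writes Iceberg-compatible metadata alongside its own transaction log, so Iceberg readers can query the same underlying Parquet data files.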

Kernel and Liquid Clustering

Delta Lake 3.0 also introduces Delta Kernel. This feature addresses connector fragmentation by letting connectors build on a single central Delta library that implements the Delta specification. As a result, users no longer need to update their Delta connectors after every new release or protocol change.

Furthermore, Delta Lake 3.0 features Delta Liquid Clustering. This feature improves read and write performance through a flexible data layout technique that should make data clustering cost-effective and scalable. It replaces traditional Hive-style table partitioning, which imposes a fixed data layout.
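A minimal sketch of how Liquid Clustering is declared, assuming the `CLUSTER BY` clause from the Delta Lake documentation (the table and column names are hypothetical):

```sql
-- Hypothetical table; CLUSTER BY takes the place of PARTITIONED BY.
CREATE TABLE web_logs (ts TIMESTAMP, user_id BIGINT, url STRING)
USING DELTA
CLUSTER BY (user_id);
```

Unlike Hive-style partitioning, the clustering keys act as a hint for incremental data layout rather than a fixed directory structure, which is what allows the layout to evolve without rewriting the table.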

The Delta Lake 3.0 release is now available in preview.

Also read: Databricks expands DeltaSharing ecosystem