Apache Iceberg is now available on the Cloudera Data Platform, providing a completely open table format to customers.
Cloudera Data Platform (CDP) combines analytics services and data management to gather data and consolidate it in a single place. Once the data is gathered, the platform can analyze it and build machine learning models.
Additionally, CDP includes a variety of tools for performing complex data preparation before a corporation’s business information is used in machine learning and data analytics.
Apache Iceberg
Now available on CDP, Apache Iceberg was initially developed at Netflix to overcome the limitations of earlier data lake table designs. It works with engines such as Apache Impala, Hive, and Spark.
Apache Iceberg is open source, which is rare for a high-level data table format. “Higher scale, more flexibility of analytic engines and services on the data lake, all without vendor lock-in,” said Ram Venkatesh, CTO at Cloudera.
Other benefits
Apache Iceberg is a completely open-source, cloud-native table format that functions like a central layer for managing data across all services. Its format makes it relatively easy to consolidate data on a single platform.