Confluent acquires WarpStream, awakens BYOC data stream dream

Data streaming platform company Confluent, Inc. has acquired WarpStream, an Apache Kafka-compatible data streaming platform. Open source Apache Kafka is a distributed data streaming technology that works to process and store data in real time. Confluent’s cloud-native foundational platform is designed to be an ‘intelligent connective tissue’ that enables real-time data from multiple sources to stream constantly across the organization. WarpStream is known for its Bring Your Own Cloud (BYOC) data streaming architecture, which lets customers run large-scale workloads with what is referred to as ‘relaxed latency requirements’ (i.e. not every piece of data is necessarily as urgent as another piece) in their own cloud environment. This is ideal for tasks such as logging, observability and the ‘feeding’ of data lakes.

Any cloud you like

The concept here – with the acquisition of WarpStream – is that Confluent now has a selection of data streaming offerings that spans fully managed services on Confluent Cloud, self-managed instances on Confluent Platform and BYOC with WarpStream.

“Confluent wants to offer data streaming to all customers with all requirements and workloads,” said Jay Kreps, co-founder and CEO, Confluent. “I’ve been deeply impressed with WarpStream – it’s BYOC done right. With this acquisition, we have a data streaming offering for everyone.”

Kreps and team suggest that many organisations struggle with the fractured and siloed nature of their data estates. According to Confluent’s latest data streaming report, 91% of IT leaders are banking on data streaming platforms to drive their organisation’s data goals forward.

The company says that the two offerings at the core of its product set, Confluent Cloud (a fully managed service) and Confluent Platform (a self-managed offering), are both data streaming platforms that stream, connect, process and govern data, but they sit at opposite ends of the spectrum of operational burden and flexibility. Confluent Cloud eliminates the operational burden, with reduced control as the trade-off. Confluent Platform provides greater flexibility at the cost of a higher operational burden. The choice between the two depends on the specific needs and capabilities of an organization.

BYOC is a ‘third way’

BYOC has emerged as a third option that falls in between fully managed and self-managed data streaming. 

“Today we can say that BYOC uses a ‘shared responsibility framework’ where the customer and the vendor are jointly responsible for the operations and health of the system. This joint responsibility gives customers more deployment flexibility, albeit at the cost of some operational overhead. This model benefits instances when regulatory or contractual barriers prevent a customer from using a fully managed solution,” notes the company, in a product statement.

WarpStream co-founders Richard Artoul and Ryan Worl say that they developed their cloud delivery approach because they ‘realised’ that if compute and storage could be separated to reduce the operational burden at the database layer, the same could also be done for the data streaming layer. WarpStream’s BYOC approach is built directly on object storage (which is just how Kora, Confluent’s own cloud-native Apache Kafka engine, works) and brings managed data streaming benefits into a customer’s cloud.

“Together with Confluent, we will continue to ensure that Kafka-compatible data streaming is accessible to every organisation,” said Richard Artoul, co-founder and CEO, WarpStream. Both Artoul and Kreps agree that, under Confluent’s product umbrella, WarpStream will continue its BYOC mission while pushing forward with what both men promise is an ‘ambitious’ roadmap.

In time, features like processing and governance will be added to WarpStream BYOC to provide a complete data streaming platform solution for high-volume logging and observability workloads.

Did someone say industry consolidation?

The data streaming industry (and indeed the open source community) will be watching developments like this with a keen eye. While the Redpandas and the StreamNatives of this world may have opinions about industry consolidation and the growth of an enterprise-grade Apache technology, the data science developer use cases will (in all probability) tell the final story on whether software engineering teams get what they want in terms of the mechanics and system fundamentals that make real-time data more usable. No doubt more will unfold at Confluent’s developer practitioner event Current this year.