
Real-time, if you will excuse the obvious expression, is real.

The ubiquity of mobile connectivity, the breadth and depth of cloud, the always-on nature of new intelligent machines, the rise of autonomous automation and the birth of RPA bots (plus a litany of other major paradigm-affecting developments) have brought us to the point where real-time compute engines and data management are now an imperative.

But real-time computing is (again, obviously) no plug-and-play affair, especially when enterprises now need to thread together multiple streams of live data and merge them with large volumes of stored data to provide historical context. 

A baseline yardstick

It’s a point worth re-emphasising: real-time data has massive implications in terms of its ability to deliver ‘instant gratification’ functionality to applications and services at many levels, but it will always need grounding in historical data as a baseline yardstick.

Without the right level of stream-to-stream join intelligence, real-time streams remain disconnected and have the potential (when two or more streams need to coalesce for a particular job or function) to cause so-called ‘wait & see’ scenarios – a term that harks back to the ‘batch’ era of overnight processing, where data has to be written to the database before anything can be done with it.

Using its appearance at KubeCon + CloudNativeCon North America this month to explain how it is aiming to address this IT fabric connection challenge, Hazelcast has positioned its real-time stream processing platform as an antidote to the wait & see paradigm. 

“Thriving in the real-time economy requires instantaneous computation on both new and historical data, something traditional databases cannot do. After years of building our extremely reliable, low-latency data store, we’re focusing on the convergence with real-time data to give enterprises a new approach to improving customer satisfaction, generating new revenue and mitigating risk,” said Kelly Herrell, CEO of Hazelcast. 

The company now offers what it calls ‘zero-code connectors’ (components of software code designed to perform the stream-joining task in question here) to accelerate the speed at which an enterprise can realise the benefits of stream processing and real-time applications in its existing infrastructure. The zero-code connectors work via a declarative method (a high-level programming principle that enables developers to ‘declare’ a desired objective or outcome without specifying the exact procedure through which that outcome is achieved – a means of abstracting away the logic layer) to retrieve contextual data from existing data platforms, simplifying how application developers access and query data, as well as how they leverage pipelines.
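To illustrate the declarative principle (rather than Hazelcast’s exact connector syntax, which should be checked against the current documentation), the sketch below uses the SQL service of the Hazelcast Java client to declare a mapping over an external table and then query it as if it were local. The mapping name, table and connection options are invented for the purpose of the example.

```java
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.sql.SqlResult;
import com.hazelcast.sql.SqlRow;

public class DeclarativeConnectorSketch {
    public static void main(String[] args) {
        HazelcastInstance client = HazelcastClient.newHazelcastClient();

        // Declare *what* data we want to reach, not *how* to fetch it.
        // The mapping type and options below are illustrative placeholders;
        // the real zero-code connector configuration will differ.
        client.getSql().execute(
            "CREATE MAPPING customers "
          + "EXTERNAL NAME \"public\".\"customers\" "
          + "TYPE JDBC "
          + "OPTIONS ('data-connection-name' = 'my-rds-postgres')");

        // Query the external table as if it were local; the platform
        // works out how to reach the underlying RDS instance.
        try (SqlResult result = client.getSql()
                .execute("SELECT id, name FROM customers WHERE region = ?", "EMEA")) {
            for (SqlRow row : result) {
                System.out.println(row.getObject("id") + " -> " + row.getObject("name"));
            }
        }

        client.shutdown();
    }
}
```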

The Hazelcast zero-code connectors are in beta and currently support Amazon Relational Database Service (RDS) for MySQL and PostgreSQL – additional connectors will be available in upcoming releases.

Database-centric? No thanks

“To embark on the path to real-time, application architects must rely on something other than a database-centric approach. Databases inherently suffer from a processing bottleneck because data must be written before being analyzed,” asserted Herrell and team, in a press statement.

The stream processing engine of the Hazelcast Platform performs upstream, in-flight computation and simultaneously merges the data with the historical context held in the built-in low-latency data store.

With the added stream-to-stream join functionality, enterprises can merge multiple data streams and handle late-arriving records. For example, an online business application may monitor both streams of orders and shipments to confirm accurate fulfilment, a success metric that keeps customers and users satisfied and loyal. 
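For a sense of what that orders-and-shipments join is doing under the hood, here is a deliberately simplified, self-contained Java sketch – not Hazelcast code – in which events from each stream are buffered by order ID until the matching record arrives from the other side, which is how a late-arriving order or shipment can still be paired up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A toy stream-to-stream join: orders and shipments arrive independently
// (and possibly late) and are matched on order ID as soon as both sides exist.
public class OrderShipmentJoinSketch {

    record Order(String orderId, String item) {}
    record Shipment(String orderId, String carrier) {}

    private final Map<String, Order> pendingOrders = new HashMap<>();
    private final Map<String, Shipment> pendingShipments = new HashMap<>();
    private final List<String> fulfilled = new ArrayList<>();

    // Called for each order event; emits a match if the shipment already arrived.
    void onOrder(Order o) {
        Shipment s = pendingShipments.remove(o.orderId());
        if (s != null) {
            fulfilled.add(o.orderId() + " shipped via " + s.carrier());
        } else {
            pendingOrders.put(o.orderId(), o);    // buffer and wait for the other stream
        }
    }

    // Called for each shipment event, which may arrive before or after its order.
    void onShipment(Shipment s) {
        Order o = pendingOrders.remove(s.orderId());
        if (o != null) {
            fulfilled.add(o.orderId() + " shipped via " + s.carrier());
        } else {
            pendingShipments.put(s.orderId(), s); // early (or late) record stays buffered
        }
    }

    public static void main(String[] args) {
        OrderShipmentJoinSketch join = new OrderShipmentJoinSketch();
        join.onOrder(new Order("A-1", "keyboard"));
        join.onShipment(new Shipment("A-2", "DHL")); // shipment arrives before its order
        join.onShipment(new Shipment("A-1", "UPS"));
        join.onOrder(new Order("A-2", "monitor"));   // late-arriving order still matches
        join.fulfilled.forEach(System.out::println);
    }
}
```

A production engine additionally has to bound how long such buffers are kept (via watermarks or time limits), which is exactly the sort of plumbing the platform is meant to take off the developer’s hands.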

By combining multiple streams with a fast low-latency data store, enterprises are here promised the chance to analyse and take action on data without waiting for it to be written to a traditional database in the batch cycle already noted. This development could be a competitive advantage for enterprises where time is of the essence.

Batch is dwindling

“With time becoming the key competitive advantage for most businesses, the days of batch processing are dwindling,” said Manish Devgan, chief product officer of Hazelcast. “The Hazelcast Platform 5.2 release is yet another step towards helping our customers move to creating real-time actionable insights and delivering new digital experiences.” 

According to Devgan, data that cannot be queried to enrich real-time insights with context is a lost opportunity.

Announced earlier this year, the Hazelcast Platform includes a Tiered Storage function that allows users to keep hot data in memory to increase throughput and reduce latency, while maintaining cold data in more cost-effective and operationally appropriate locations.

The Tiered Storage capability is said to perform on par with the Hazelcast Platform’s High-Density Memory Store feature. As an additional benefit, Tiered Storage enables users to enrich real-time data with reference data stored on NVMe-based SSDs.
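Conceptually, the hot/cold split works something like the toy Java sketch below – a bounded in-memory tier that demotes its least-recently-used entries to a larger, cheaper tier – though this is purely an illustration of the idea and not Hazelcast’s Tiered Storage implementation.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// A conceptual two-tier store: a small LRU 'hot' tier in memory backed by a
// larger 'cold' tier (a plain map standing in for an SSD-backed store).
public class TieredStoreSketch<K, V> {

    private static final int HOT_CAPACITY = 1_000;

    // Cold tier: stands in for cheaper, slower storage such as NVMe SSDs.
    private final Map<K, V> coldTier = new HashMap<>();

    // Hot tier: access-ordered LRU map that spills its eldest entry downwards.
    private final Map<K, V> hotTier = new LinkedHashMap<K, V>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            if (size() > HOT_CAPACITY) {
                coldTier.put(eldest.getKey(), eldest.getValue()); // demote to cold tier
                return true;
            }
            return false;
        }
    };

    public void put(K key, V value) {
        hotTier.put(key, value);          // writes land in memory first
    }

    public V get(K key) {
        V value = hotTier.get(key);
        if (value == null) {
            value = coldTier.remove(key); // cold hit: promote back into memory
            if (value != null) {
                hotTier.put(key, value);
            }
        }
        return value;
    }

    public static void main(String[] args) {
        TieredStoreSketch<String, String> store = new TieredStoreSketch<>();
        store.put("order-1", "keyboard");
        System.out.println(store.get("order-1")); // served from the hot tier
    }
}
```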

Also here we find a Management Center SQL Browser, an element of the Hazelcast Platform designed to improve the usability of its Management Center for executing streaming SQL queries. Sitting alongside it is Compact Serialization which, now ready for production environments, consumes less space, requires no editing and enables seamless data model evolution.
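For flavour, this is roughly what a Compact serializer looks like in the Hazelcast 5.2 Java API as documented – the Employee class and field names are invented for illustration, and the exact package and method names should be verified against the current docs.

```java
import com.hazelcast.nio.serialization.compact.CompactReader;
import com.hazelcast.nio.serialization.compact.CompactSerializer;
import com.hazelcast.nio.serialization.compact.CompactWriter;

// Hypothetical domain class used only for this illustration.
class Employee {
    final long id;
    final String name;

    Employee(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// The serializer writes named fields rather than a fixed binary layout, which is
// what allows fields to be added or removed later without breaking older readers
// (the 'seamless data model evolution' point above). It is registered with the
// member or client configuration under compact serialization.
public class EmployeeCompactSerializer implements CompactSerializer<Employee> {

    @Override
    public Employee read(CompactReader reader) {
        return new Employee(reader.readInt64("id"), reader.readString("name"));
    }

    @Override
    public void write(CompactWriter writer, Employee employee) {
        writer.writeInt64("id", employee.id);
        writer.writeString("name", employee.name);
    }

    @Override
    public String getTypeName() {
        return "employee"; // stable type name, decoupled from the Java class name
    }

    @Override
    public Class<Employee> getCompactClass() {
        return Employee.class;
    }
}
```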

Hazelcast is of course not the only fruit in the data streaming apple basket and the usual suspects in this space include Confluent, Amazon Kinesis Data Streams, TIBCO with its Spotfire product, IBM Streaming Analytics, the Apache family (Kafka, Storm, NiFi, Flink) and Google Cloud Dataflow, to name a few. It’s a market (or perhaps that should be a sub-market of data science and data management) that is seeing exponential growth right now.

Hold your breath for low-code data streaming democratisation for all users: it’s inevitable.