Google Cloud has made Datastream for BigQuery generally available. The tool lets developers “stream” data changes from source databases into BigQuery tables in near real time.
The tool had been in beta since September 2022.
With this form of data integration, developers no longer need to build data pipelines or program their ETL and ELT processes themselves. Data integration in BigQuery is faster and more efficient as a result, the thinking goes.
According to Google Cloud, the benefits include real-time insights in BigQuery and serverless ELT and ETL pipelines that scale automatically, with no resources to set up or manage.
In addition, Datastream for BigQuery tolerates changes to source schemas. It handles this “schema drift” without problems, automatically replicating new columns and tables from the source to the BigQuery environment. For this purpose, the solution uses BigQuery’s new change data capture (CDC) capability and the UPSERT feature of the Storage Write API.
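To make the idea concrete, here is a minimal sketch in Python of how a stream of change events could be applied to a keyed table with UPSERT and DELETE semantics, while new columns are picked up along the way. It is an in-memory illustration only, not Datastream's or BigQuery's actual implementation; the function name `apply_change`, the event layout and the sample data are all assumptions made for this example.

```python
from typing import Any, Dict, List

def apply_change(table: Dict[Any, dict], columns: List[str],
                 event: dict, key: str = "id") -> None:
    """Apply one change event to an in-memory table keyed by primary key.

    Hypothetical event layout: {"change_type": "UPSERT" | "DELETE", "row": {...}}.
    """
    row = dict(event["row"])
    if event["change_type"] == "DELETE":
        # Remove the row if it exists; ignore deletes for unknown keys.
        table.pop(row[key], None)
        return
    # Schema drift: register any column that has not been seen before.
    for name in row:
        if name not in columns:
            columns.append(name)
    # UPSERT: insert a new row or replace the existing row with the same key.
    table[row[key]] = row

# Illustrative change stream (made-up data).
events = [
    {"change_type": "UPSERT", "row": {"id": 1, "name": "Ada"}},
    {"change_type": "UPSERT", "row": {"id": 2, "name": "Linus"}},
    # The source gained an "email" column; the event simply carries it.
    {"change_type": "UPSERT", "row": {"id": 1, "name": "Ada L.", "email": "ada@example.com"}},
    {"change_type": "DELETE", "row": {"id": 2}},
]

table: Dict[Any, dict] = {}
columns: List[str] = ["id", "name"]
for event in events:
    apply_change(table, columns, event)

print(columns)
print(table)
```

After the four events, only the row with key 1 remains, in its latest version, and the column list has grown to include `email`. That is the behavior replication aims for: the destination reflects the current state of the source rather than a log of every change.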
Users need only configure the source database, the connection type and the destination in BigQuery. Supported databases include MySQL, PostgreSQL, AlloyDB and Oracle.
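The three settings mentioned above can be pictured as a small configuration object. The Python sketch below is purely illustrative: the class `StreamSettings`, its field names and the example values are assumptions for this article and do not correspond to the actual Datastream API or console fields. Only the list of supported sources comes from the announcement.

```python
from dataclasses import dataclass

# Source databases named in the announcement.
SUPPORTED_SOURCES = {"mysql", "postgresql", "alloydb", "oracle"}

@dataclass(frozen=True)
class StreamSettings:
    """Hypothetical container for the three things a user configures."""
    source_type: str          # which kind of database to replicate from
    connection_type: str      # how the source is reached (value is illustrative)
    destination_dataset: str  # where the data lands in BigQuery

    def __post_init__(self) -> None:
        # Reject sources that are not on the supported list.
        if self.source_type.lower() not in SUPPORTED_SOURCES:
            raise ValueError(f"unsupported source: {self.source_type}")
        # All three settings must be filled in.
        if not self.connection_type or not self.destination_dataset:
            raise ValueError("connection type and destination are required")

# Example values are made up.
settings = StreamSettings(
    source_type="postgresql",
    connection_type="ip-allowlist",
    destination_dataset="analytics",
)
print(settings)
```

Validating the settings at construction time means a typo in the source type surfaces immediately instead of later, when a stream fails to start.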