AWS has made its Glue Data Quality service generally available. The service is designed to help companies monitor and improve data quality across their data lakes and pipelines.
According to AWS, many companies pay little attention to the quality of the data housed in their data lakes when building them. As a result, the tech giant says, data lakes tend to become “data swamps.”
Improving data quality is often a complex and lengthy process for engineers, involving meticulous manual inspection of the data, formulating data quality requirements, and writing code to alert on deteriorating quality.
AWS Glue Data Quality service
To this end, the tech giant is positioning its AWS Glue Data Quality service as a way to speed up data quality work by reducing these manual tasks. The service automatically computes statistics, suggests quality rules, monitors data and sends alerts when it detects that quality is deteriorating.
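Quality rules in Glue Data Quality are expressed in AWS's Data Quality Definition Language (DQDL). As an illustration only, a minimal ruleset for a hypothetical orders table might look like the following sketch (the column names and thresholds are assumptions, not taken from the article):

```
Rules = [
    # The dataset must not be empty
    RowCount > 0,
    # Every record needs an order identifier
    IsComplete "order_id",
    # At least 99% of order identifiers must be unique
    Uniqueness "order_id" > 0.99,
    # Status values must come from a known set
    ColumnValues "status" in ["PENDING", "SHIPPED", "DELIVERED"]
]
```

When such a ruleset is evaluated against a dataset, each rule passes or fails individually, which is what allows the service to flag deteriorating quality over time.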
In this way, monitoring data quality should become more efficient, preventing potential downstream problems for business users.
The new service is a serverless feature of AWS Glue, so there is no infrastructure to manage or maintain. Users can access it through the AWS Glue Data Catalog, Glue Studio and Glue Studio notebooks, as well as from their preferred code editors.