The growing complexity and volume of contemporary data pipelines have boosted the significance of smart data quality monitoring infrastructures. The traditional rule-based techniques tend to fail or provide unreliable analytics in dynamic and high-throughput environments, causing silent failures. This review explores the possibility of artificial intelligence (AI) and machine learning (ML) leveraging the use of adaptive data quality alerting systems that can be implemented in scale. It gives importance to architecture concepts, model approaches, and tooling environments that help in anomaly detection and automated remediation through self-healing pipelines in real-time. The argument is furthered along the artifacts of anomaly detection models, streaming data platforms, orchestration frameworks, and feedback-based model retraining. Some important contributions are a proposal of a modular architecture that can perform real-time alerting and classification of tooling options depending on each stage of the pipeline, and an overview of governance considerations. The research areas are defined as gaps that need to be addressed in the field of model interpretability, real-time integration, and operational benchmarking of autonomous, intelligent data quality management systems in a distributed environment. The review ends with the suggested route of study development.
Building similarity graph...
Analyzing shared references across papers
Loading...
Gayatri Tavva
World Journal of Advanced Engineering Technology and Sciences
Building similarity graph...
Analyzing shared references across papers
Loading...
Gayatri Tavva (Wed,) studied this question.
www.synapsesocial.com/papers/68c1a8fe54b1d3bfb60e1c7d — DOI: https://doi.org/10.30574/wjaets.2025.16.1.1235