This paper presents an adaptive statistical framework for detecting data quality issues in streaming pipelines in real time, replacing hardcoded rule-based validation with distribution-aware monitoring. The framework combines Kolmogorov-Smirnov testing, Population Stability Index scoring, and Wasserstein distance measurement with an empirically validated recalibration heuristic. It was validated on 2,136,000 synthetic records spanning 90 days and 1,424 micro-batches. The results show a 98.4% reduction in false positives relative to a rule-based baseline, with 0 incorrect recalibrations over the full simulation period.
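To make the three distribution measures named in the abstract concrete, the following is a minimal sketch of how each could be computed for one micro-batch against a reference window. It is not the paper's implementation: the function names, the ten quantile bins and the `eps` floor used for PSI, the Gaussian test data, and the batch size of 1,500 records (the average implied by 2,136,000 records over 1,424 micro-batches) are all assumptions made for illustration. Alert thresholds and the recalibration heuristic are not specified in the abstract, so they are left out.

```python
import bisect
import math
import random


def ks_statistic(reference, batch):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(batch)
    gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap occurs at a data point.
    for x in ref + cur:
        cdf_ref = bisect.bisect_right(ref, x) / len(ref)
        cdf_cur = bisect.bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def psi(reference, batch, bins=10, eps=1e-6):
    """Population Stability Index over bins cut at reference quantiles.

    The bin count and the eps floor (which avoids log(0)) are assumed values.
    """
    ref = sorted(reference)
    edges = [ref[int(len(ref) * i / bins)] for i in range(1, bins)]

    def shares(values):
        counts = [0] * bins
        for v in values:
            counts[bisect.bisect_right(edges, v)] += 1
        return [max(c / len(values), eps) for c in counts]

    expected, actual = shares(reference), shares(batch)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


def wasserstein_1d(reference, batch):
    """1-D Wasserstein-1 distance; this sketch assumes equal sample sizes."""
    if len(reference) != len(batch):
        raise ValueError("this sketch assumes equal sample sizes")
    pairs = zip(sorted(reference), sorted(batch))
    return sum(abs(a - b) for a, b in pairs) / len(reference)


def drift_report(reference, batch):
    """Collect the three scores for one micro-batch (hypothetical helper)."""
    return {
        "ks": ks_statistic(reference, batch),
        "psi": psi(reference, batch),
        "wasserstein": wasserstein_1d(reference, batch),
    }


# Synthetic illustration data, not the paper's dataset.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(1500)]
stable = [rng.gauss(0.0, 1.0) for _ in range(1500)]   # same distribution
shifted = [rng.gauss(1.0, 1.0) for _ in range(1500)]  # mean shifted by one sigma

stable_scores = drift_report(reference, stable)
shifted_scores = drift_report(reference, shifted)
```

All three scores are zero when a batch is identical to the reference and grow as the batch distribution moves away from it, which is what lets a monitor flag drift without hardcoded per-field rules. In practice, SciPy's `scipy.stats.ks_2samp` and `scipy.stats.wasserstein_distance` cover the first and third measures (including a p-value for the KS test and unequal sample sizes); the pure-Python versions here only keep the sketch dependency-free.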
Thanigaivendhan Thiyagarajan (Thu,) studied this question.
synapsesocial.com/papers/69bf898bf665edcd009e94d5 — DOI: https://doi.org/10.5281/zenodo.19129208
Thanigaivendhan Thiyagarajan