Different anomaly detectors disagree more often than they agree on financial streams, and without ground-truth labels there is rarely a principled way to decide which detector to trust on any given window. This work reports a head-to-head streaming comparison of four widely used unsupervised detectors (Robust Random Cut Forest, Half-Space Trees, an Incremental Local Outlier Factor variant, and a streaming LightGBM classifier) under a common sliding-window protocol. To address the label-scarcity problem, we introduce a Kalman-filter-based predictive-stability validator: the anomalies flagged by each detector are removed from the window, a Kalman state-space model is re-estimated on the cleaned series, and the resulting reduction in the residual sum of squares is recorded as a label-free quality score. We evaluate every detector on two complementary datasets, the public Kaggle NIFTY-50 minute-level dataset and a 46-column intraday Fyers high-frequency feed, and find that the detector ordering is consistent across both. The proposed convex-fusion detector achieves an F1 score of 0.943 on Kaggle and 0.957 on Fyers, with an end-to-end per-window latency of 9.2 ms at a window size of W = 500. The Kalman validator yields a 21.4 percent residual-error reduction relative to no removal, exceeding the 14.4 to 18.1 percent reductions achieved by the individual detectors. Two contributions follow: a label-free streaming validation procedure that does not depend on weak proxy labels, and a reproducible cross-detector, cross-dataset benchmark against which future real-time financial detectors can be compared.
Gupta et al. (Thu,) studied this question.
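The predictive-stability score described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a univariate local-level (random-walk) Kalman model, fixes the noise variances `q` and `r` rather than re-estimating them, treats flagged points as missing observations, and defines the score as the relative drop in the one-step-ahead residual sum of squares. The function names `kalman_rss` and `stability_score` are hypothetical.

```python
import numpy as np

def kalman_rss(y, mask=None, q=1e-2, r=1.0):
    """One-step-ahead residual sum of squares of a local-level Kalman filter.

    y    : 1-D array of observations in the window
    mask : boolean array, True where a point was flagged (treated as missing)
    q, r : process and observation noise variances (fixed here by assumption)
    """
    y = np.asarray(y, dtype=float)
    if mask is None:
        mask = np.zeros(len(y), dtype=bool)
    mask = np.asarray(mask, dtype=bool)
    keep = np.flatnonzero(~mask)
    if len(keep) == 0:
        return 0.0
    # Initialise the state from the first retained observation.
    x, p = y[keep[0]], r
    rss = 0.0
    for t in range(keep[0] + 1, len(y)):
        p = p + q                  # predict step of the random-walk model
        if mask[t]:
            continue               # flagged point: no update, no residual
        innov = y[t] - x           # one-step-ahead prediction residual
        rss += innov ** 2
        k = p / (p + r)            # Kalman gain
        x = x + k * innov          # update state estimate
        p = (1.0 - k) * p          # update state variance
    return rss

def stability_score(y, flagged, q=1e-2, r=1.0):
    """Relative RSS reduction after removing the flagged points."""
    base = kalman_rss(y, None, q, r)
    cleaned = kalman_rss(y, flagged, q, r)
    return 0.0 if base == 0.0 else 1.0 - cleaned / base
```

Under these assumptions, a detector that flags the points most disruptive to the filter's predictions receives a score near 1, while one that flags ordinary points receives a score near 0. Because removal shrinks the number of residuals being summed, a per-point normalisation may be preferable when detectors flag very different numbers of points.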