Robust vehicle detection in real-world traffic surveillance remains challenging because imagery degraded by motion blur, adverse weather, and low illumination significantly increases detector sensitivity to hyperparameter configuration. This study proposes a “Frugal AI” distributed multi-GPU framework that optimizes hyperparameters via a stochastic simplex-based search coupled with five-fold cross-validation. Using three low-cost NVIDIA GTX 1050 Ti GPUs, the framework explores candidates in parallel and applies an asynchronous model-level exchange mechanism to escape local optima without the overhead of gradient synchronization. Seven CNN backbones (VGG16, VGG19, GoogLeNet, MobileNetV2, ResNet18, ResNet50, and ResNet101) were evaluated within YOLOv2 and Faster R-CNN detectors. Because of the memory constraint (4 GB VRAM), YOLOv2 was selected for extensive benchmarking. Performance was measured using a cost metric based on the harmonic mean of precision and recall, which strictly penalizes imbalanced outcomes. Experimental results demonstrate that, under identical wall-clock time budgets, the proposed framework achieves an average 1.38% reduction in aggregated cost across all models, with the highly sensitive VGG19 backbone showing a 4.00% improvement. Benchmarking against Bayesian optimization, genetic algorithms, and random search confirms that the proposed method achieves superior optimization quality with statistical significance (p < 0.05). Under a rigorous IoU = 0.75 threshold, the optimized models consistently yielded F1-scores of 0.8444 ± 0.0346. Ablation studies further validate that the collaborative model exchange is essential for accelerating convergence in rugged loss landscapes. This research offers a practical, scalable, and cost-efficient solution for deploying robust AI surveillance in resource-constrained smart city infrastructure.
Tsai et al. (Fri,) studied this question.
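The abstract describes a cost metric built on the harmonic mean of precision and recall but does not give its exact form. As a minimal sketch, assuming the cost is simply 1 − F1 computed from detection counts (the function name `harmonic_pr_cost` and this exact definition are illustrative assumptions, not the authors' formula), the following shows why a harmonic cost penalizes imbalanced precision and recall more than an arithmetic average would:

```python
def harmonic_pr_cost(tp: int, fp: int, fn: int) -> float:
    """Hypothetical cost: 1 - F1, where F1 is the harmonic mean of
    precision and recall. Assumed form; the paper's metric may differ."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0.0:
        return 1.0  # no correct detections: worst possible cost
    f1 = 2.0 * precision * recall / (precision + recall)
    return 1.0 - f1


# Two detectors with the same arithmetic mean of precision and recall (0.8).
balanced = harmonic_pr_cost(tp=80, fp=20, fn=20)    # precision 0.8, recall 0.8
imbalanced = harmonic_pr_cost(tp=60, fp=0, fn=40)   # precision 1.0, recall 0.6

print(f"balanced cost:   {balanced:.4f}")
print(f"imbalanced cost: {imbalanced:.4f}")
```

Under this assumed definition, the imbalanced detector receives the higher cost even though its average of precision and recall matches the balanced one, which is the behaviour the abstract attributes to the harmonic metric.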