Accurate detection of tomato ripeness and size is critical for robotic thinning and harvesting but remains challenged by performance degradation in adverse weather, imprecise size estimation, and computational constraints on edge devices. To bridge this gap, we introduce (1) the TIDAW dataset (Tomato Images in Diverse Adverse Weather), synthetically generated via a physically grounded atmospheric scattering model to simulate realistic rain and fog; and (2) Edge-YOLO-Tomato, a novel YOLOv8-based architecture featuring four key innovations: a physics-aware scattering module that unifies multi-particle light transport theory with dual-attention mechanisms to explicitly model wavelength-dependent scattering for robust feature disentanglement; dilated convolutions that enlarge the receptive field; a prior-embedded Wise-IoU loss that incorporates botanical size distribution priors to rectify bounding box bias; and a compression framework that combines magnitude pruning and layer-wise pruning using neural architecture search. Extensive evaluations demonstrate leading performance: Edge-YOLO-Tomato achieves 93.3% mAP50 and 74.3% mAP50:95 on TIDAW, surpassing baselines including YOLOv8, YOLOv11, Faster R-CNN, and RT-DETR by 1.1%-26.3% and 0.2%-2.2%, respectively. The compressed model attains a 4.7373 MB footprint (a 20.58% size reduction) with ≤ 0.5% accuracy loss and delivers a 50% latency reduction on CPU. This work establishes a new paradigm for vision-based precision agriculture by unifying physical data synthesis, physics-aware modeling, and a compression framework, enabling real-time, robust fruit detection in uncontrolled environments. The code is available at https://github.com/YLu567/Edge-YOLO-Tomato.
Cai et al. (Thu,) studied this question.
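The abstract does not spell out the atmospheric scattering model used to generate TIDAW, so the following is only a minimal sketch of the standard single-scattering haze formulation, I = J·t + A·(1 − t) with t = exp(−β·d), which is commonly used for fog synthesis. The function name `add_synthetic_fog`, the default values of `beta` and `airlight`, and the use of a per-pixel depth map are assumptions made for illustration and are not taken from the paper. Passing a three-element `beta` is one simple way to approximate wavelength-dependent attenuation per color channel.

```python
import numpy as np


def add_synthetic_fog(image, depth, beta=1.2, airlight=0.9):
    """Apply the standard haze model I = J * t + A * (1 - t).

    image:    clean image J, floats in [0, 1], shape (H, W, 3)
    depth:    per-pixel scene depth d, shape (H, W), arbitrary units
    beta:     scattering coefficient (assumed value); a scalar, or three
              values for a per-channel (wavelength-dependent) variant
    airlight: global atmospheric light A (assumed value)
    """
    image = np.asarray(image, dtype=np.float64)
    depth = np.asarray(depth, dtype=np.float64)
    beta = np.asarray(beta, dtype=np.float64)
    # Transmission decays exponentially with depth: t = exp(-beta * d).
    # The trailing axis lets a scalar or 3-vector beta broadcast over channels.
    transmission = np.exp(-depth[..., np.newaxis] * beta)
    foggy = image * transmission + airlight * (1.0 - transmission)
    return np.clip(foggy, 0.0, 1.0)


# Toy inputs: a random "image" and a depth ramp from 0 (near) to 3 (far).
rng = np.random.default_rng(0)
clean = rng.random((4, 4, 3))
depth = np.linspace(0.0, 3.0, 16).reshape(4, 4)

foggy = add_synthetic_fog(clean, depth)
foggy_rgb = add_synthetic_fog(clean, depth, beta=[1.0, 1.2, 1.4])
```

In this sketch, pixels at zero depth are left unchanged, while distant pixels converge toward the airlight value, which is the qualitative behavior expected of fog. Rain streaks would need an additional model that is not covered here.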