As the demand for intelligent dietary management continues to grow, food recognition technology shows strong application prospects in areas such as mobile platforms. To address the core challenge of food image classification, this paper augments the YOLOv8s-cls (You Only Look Once) architecture with the Focal Loss function, aiming to improve its ability to discriminate hard and minority-class samples. The model is trained on a large food dataset containing 288 classes and is optimized using data augmentation strategies such as MixUp and CutMix. It is evaluated in multiple application scenarios, including normal images, occluded images, blurred images, and high- and low-frequency class subsets. On normal images, the enhanced model attains 80.64% Top-1 accuracy and 95.73% Top-5 accuracy, and it is more robust than the original model under challenging conditions. Overall, the results indicate that the Focal Loss function improves the model's ability to classify diverse food images, supporting the practical implementation of intelligent dietary systems.
Sheng Ran studied this question.
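To make the role of the Focal Loss function in the abstract concrete, the following is a minimal sketch of the standard multi-class focal loss, FL(p_t) = -α (1 - p_t)^γ log(p_t), where p_t is the softmax probability assigned to the true class. It is not the paper's implementation: the function names and the values `gamma=2.0` and `alpha=1.0` are illustrative assumptions, since the abstract does not state the hyperparameters used.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are assumed defaults, not values from the paper.
    With gamma = 0 and alpha = 1 this reduces to plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Standard cross-entropy, expressed as the gamma = 0 special case."""
    return focal_loss(logits, target, gamma=0.0, alpha=1.0)


# Hypothetical 3-class scores; the true class is index 0 in both cases.
easy_logits = [4.0, 0.0, 0.0]   # confidently correct prediction
hard_logits = [0.0, 1.0, 0.0]   # true class scored below a wrong class

for name, logits in (("easy", easy_logits), ("hard", hard_logits)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f}  focal={fl:.4f}  ratio={fl / ce:.4f}")
```

The printed ratio is the modulating factor (1 - p_t)^γ. It is close to zero for the confidently classified sample and much larger for the misclassified one, which is how the loss shifts training emphasis toward hard or minority-class examples.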