To address challenges in few-shot image classification, this study introduces the Multi-Dimensional Attention and Composite Loss Former, a meta-learning model built on a Residual Network-12 (ResNet-12) backbone. The model incorporates multi-dimensional attention mechanisms and is trained with a composite loss function applied across the entire architecture. The attention mechanisms enhance feature extraction by dynamically focusing on critical local and global information, while the composite loss optimizes classification accuracy, emphasizes hard samples, suppresses overfitting, and promotes intra-class feature compactness. Comprehensive experiments on the miniImageNet and tieredImageNet datasets show that the proposed model outperforms existing baseline methods in both the meta-training and meta-testing stages, supporting its robustness and generalization capability in few-shot learning tasks.
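The abstract names four goals for the composite loss but does not state which terms implement them. As a minimal sketch, assume one standard term per goal: cross-entropy for classification accuracy, a focal term for hard samples, label smoothing against overfitting, and a center-loss term for intra-class compactness. The function name `composite_loss`, the weights, and the values of `gamma` and `eps` are all hypothetical and are not taken from the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_z for z in logits]


def composite_loss(logits, label, feature, centers,
                   gamma=2.0, eps=0.1,
                   w_focal=1.0, w_smooth=1.0, w_center=0.01):
    """Illustrative composite loss for one sample (assumed terms, not the paper's)."""
    log_p = log_softmax(logits)
    k = len(logits)
    p_true = math.exp(log_p[label])

    # Classification accuracy: plain cross-entropy on the true class.
    ce = -log_p[label]

    # Hard samples: focal weighting shrinks the loss of confident predictions.
    focal = (1.0 - p_true) ** gamma * ce

    # Overfitting: cross-entropy against a label-smoothed target distribution.
    smooth = -sum(((1.0 - eps) * (i == label) + eps / k) * log_p[i]
                  for i in range(k))

    # Intra-class compactness: squared distance to the class center.
    center = 0.5 * sum((f - c) ** 2 for f, c in zip(feature, centers[label]))

    total = ce + w_focal * focal + w_smooth * smooth + w_center * center
    return {"ce": ce, "focal": focal, "smooth": smooth,
            "center": center, "total": total}


if __name__ == "__main__":
    centers = [[0.0, 0.0], [1.0, 1.0]]  # hypothetical per-class feature centers
    terms = composite_loss([2.0, 0.5], label=0, feature=[0.2, -0.1], centers=centers)
    for name, value in terms.items():
        print(f"{name}: {value:.4f}")
```

In this sketch the focal term is always no larger than the cross-entropy term, so well-classified samples contribute less to the total, while the center term pulls each feature toward the center of its class. In practice the class centers would be learned or estimated per episode, and the term weights would need tuning.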
Shi et al. (Sun,) studied this question.