Label smoothing is a widely used technique in domains such as text classification, image classification, and speech recognition, and is known for effectively combating model overfitting. However, there is little fine-grained analysis of how label smoothing enhances text sentiment classification. To fill this gap, this article performs a set of in-depth analyses on eight text sentiment classification datasets and three deep learning architectures (TextCNN, BERT, and RoBERTa) under two learning schemes: training from scratch and fine-tuning. By tuning the smoothing parameters, we achieve improved performance on almost all datasets for each model architecture. Specifically, our experiments demonstrate that label smoothing improves accuracy by 0.5–2.3 percent across different architectures, with the best results achieved using smoothing parameters λ ∈ [0.01, 0.1] for three-class datasets and λ ∈ [0.01, 0.15] for binary-class datasets. We further investigate the benefits of label smoothing, finding that it can accelerate the convergence of deep models by 15–30 percent and make samples of different labels more easily distinguishable. Additionally, we provide a comprehensive analysis including macro-F1, precision, and recall metrics to ensure robust evaluation across datasets with varying class distributions.
Si et al. (Thu,) studied this question.
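
For readers unfamiliar with the technique, the following is a minimal sketch of label smoothing with a smoothing parameter λ. It assumes the common uniform formulation, in which the one-hot target is mixed with a uniform distribution over the K classes; the abstract does not state the exact variant used, so this is an illustration rather than the authors' implementation. The function names `smooth_labels`, `softmax`, and `cross_entropy` are hypothetical.

```python
import math


def smooth_labels(label, num_classes, lam):
    """Return a smoothed target distribution for one sample.

    Assumed (uniform) formulation: (1 - lam) * one_hot + lam / num_classes.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    uniform = lam / num_classes
    return [
        (1.0 - lam) * (1.0 if k == label else 0.0) + uniform
        for k in range(num_classes)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, logits):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs))


# Usage: a three-class sample with true label 2 and lam = 0.1.
target = smooth_labels(label=2, num_classes=3, lam=0.1)
loss = cross_entropy(target, [0.2, -1.0, 3.0])
```

Setting `lam = 0` recovers the ordinary one-hot cross-entropy, so the same loss function can be used to compare smoothed and unsmoothed training when sweeping λ.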