September 10, 2025

Mixup Helps Translation, But Do the Coefficients and the Selection Strategy Influence Translation Quality?

Key Points

Mixup enhanced machine translation quality, achieving an average increase of 1.9 BLEU points over the baseline model.
Varying the mixing coefficient had minimal impact on translation quality across the low-resource language pairs tested.
Employing either semantically similar or dissimilar samples for mixup did not result in significant improvements over standard mixup.
Systematic experiments conducted across several low-resource language pairs demonstrated the robust effectiveness of mixup.

Abstract

Mixup, an interpolation-based method that implicitly generates synthetic examples for training, has shown effectiveness in tasks such as image and text classification. Standard mixup randomly interpolates two samples of images and their labels. In this paper, we apply mixup to low-resource machine translation tasks by interpolating in the hidden space. We investigate the impact of different mixing coefficients on this technique. We also explore whether semantically related or unrelated samples provide more benefits for interpolation compared to random selection. To investigate this, we extend the standard mixup approach by selecting samples based on distance and experimenting with different sampling settings. Our experiments are conducted across several low-resource language pairs, including Lower Sorbian and Upper Sorbian, Lower Sorbian and German, and Upper Sorbian and German. Through systematic experiments on multiple language pairs, we evaluate the effectiveness of mixup data augmentation in improving low-resource machine translation performance. Our findings indicate that the standard mixup technique enhances the quality of machine translation, resulting in an average increase of 1.9 BLEU points over the baseline Transformer model. The choice of mixing coefficients has minimal impact on translation quality, which suggests that fine-tuning these coefficients is not essential to benefit from mixup. In addition, the standard mixup performs robustly, as selecting either the most similar or most dissimilar samples for mixing does not provide a significant improvement over it.
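
To make the abstract's two ingredients concrete, here is a minimal sketch of hidden-space interpolation with a mixing coefficient and of partner selection by distance. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not say which layer is mixed, how the coefficient is drawn, or which distance is used, so the sketch assumes one fixed-size hidden vector per sentence, a coefficient drawn from Beta(alpha, alpha) as in the original mixup formulation, and Euclidean distance. All function names and the strategy labels "random", "nearest" and "farthest" are hypothetical.

```python
import math
import random


def sample_lambda(alpha, rng):
    """Draw a mixing coefficient in [0, 1] from Beta(alpha, alpha) (assumed)."""
    return rng.betavariate(alpha, alpha)


def choose_partners(vectors, strategy, rng):
    """For each sample, return the index of the sample it is mixed with.

    vectors:  list of equal-length float lists, one per sentence (assumption).
    strategy: "random" (standard mixup), "nearest", or "farthest".
    """
    n = len(vectors)
    if strategy == "random":
        # A shuffled order; a sample may occasionally be paired with itself.
        order = list(range(n))
        rng.shuffle(order)
        return order
    if strategy not in ("nearest", "farthest"):
        raise ValueError(f"unknown strategy: {strategy}")
    if n < 2:
        raise ValueError("distance-based selection needs at least two samples")
    pick = min if strategy == "nearest" else max
    partners = []
    for i in range(n):
        # Exclude the sample itself, then rank the rest by Euclidean distance.
        others = [j for j in range(n) if j != i]
        partners.append(pick(others, key=lambda j: math.dist(vectors[i], vectors[j])))
    return partners


def mixup_hidden(vectors, partners, lam):
    """Interpolate each vector with its partner: lam * h_i + (1 - lam) * h_j."""
    return [
        [lam * a + (1.0 - lam) * b for a, b in zip(vectors[i], vectors[j])]
        for i, j in enumerate(partners)
    ]


def mixup_loss(loss_own, loss_partner, lam):
    """Weight the two target-side losses with the same coefficient."""
    return lam * loss_own + (1.0 - lam) * loss_partner


if __name__ == "__main__":
    rng = random.Random(0)
    batch = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]]  # toy hidden vectors
    lam = sample_lambda(0.2, rng)                   # alpha = 0.2 is illustrative
    for strategy in ("random", "nearest", "farthest"):
        partners = choose_partners(batch, strategy, rng)
        print(strategy, partners, mixup_hidden(batch, partners, lam))
```

With lam = 1.0 the mixed vectors equal the originals, and with lam = 0.0 they equal the partners, so the coefficient controls how far each training example is moved toward its partner. The three strategies differ only in how the partner index is chosen, which is the comparison the abstract describes.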

Cite This Study

Zhou et al. studied this question.

synapsesocial.com/papers/68c1afb954b1d3bfb60e70fd https://doi.org/10.1145/3750043
