Despite the widespread adoption of Neural Machine Translation (NMT), existing systems still make poor use of user feedback and deliver limited gains in translation quality on Japanese translation tasks. This paper proposes a Reinforcement Learning (RL)-based interactive machine translation feedback optimization system that continuously improves Japanese translation quality through user interaction. The system first provides a human-computer interaction interface to collect user evaluations and correction suggestions for initial translations. A reward function then quantifies the consistency between the translation model’s output and the user feedback, and a policy gradient method updates the NMT model parameters so that the model absorbs this feedback efficiently. In addition, the paper incorporates a reward resampling mechanism to mitigate the training instability caused by sparse feedback, and introduces a dynamic weight adjustment strategy to make better use of diverse user feedback. Experiments on the Japanese Patent Corpus and the WAT 2023 dataset show that the system reduces the Translation Edit Rate (TER) to 39.6%, lowers the average number of interaction turns to a minimum of 1.21, and decreases the average user edit distance to 5.48. RL-based interactive feedback optimization thus offers a novel approach to improving Japanese machine translation quality.
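The reward and policy-gradient steps summarized above can be illustrated with a minimal sketch. The abstract does not give the paper's actual reward function, resampling mechanism, or weighting strategy, so everything here is an assumption made for illustration: the reward is taken to be one minus the normalized token-level edit distance between the model output and the user's corrected translation, and the update is a plain REINFORCE surrogate loss with a scalar baseline. The names `edit_distance`, `feedback_reward`, and `reinforce_loss` are hypothetical.

```python
from typing import List, Sequence


def edit_distance(hyp: Sequence[str], ref: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            # deletion, insertion, substitution (or match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def feedback_reward(hyp: Sequence[str], user_fix: Sequence[str]) -> float:
    """Assumed reward in [0, 1]: 1.0 when the user left the output unchanged."""
    longest = max(len(hyp), len(user_fix))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(hyp, user_fix) / longest


def reinforce_loss(token_log_probs: List[float], reward: float, baseline: float) -> float:
    """REINFORCE surrogate loss: -(reward - baseline) * sum of token log-probs."""
    return -(reward - baseline) * sum(token_log_probs)


# Illustrative values only: the user corrected one of four output tokens.
hyp = ["a", "b", "c", "d"]
user_fix = ["a", "x", "c", "d"]
reward = feedback_reward(hyp, user_fix)
loss = reinforce_loss([-0.5, -0.5], reward, baseline=0.25)
```

When the reward exceeds the baseline, minimizing this loss raises the log-probability of the sampled translation; when it falls below the baseline, the same update lowers it. In a real system the log-probabilities would come from the NMT model and the loss would be differentiated by an autodiff framework rather than computed on plain floats.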
Cui et al. studied this question.