What question did this study set out to answer?

The aim is to enhance domain-specific machine translation using synthetic feedback for large language models.

February 2, 2026

Domain Adaptive Machine Translation with Synthetic Feedback for Large Language Models

Key Points

Developed a pipeline for collecting in-domain translations and generating synthetic feedback.
Created a demonstration database pairing original translations with their revisions.
Evaluated the pipeline with Llama3-8B-Instruct and Mistral-7B-Instruct-v0.3 on five domain-specific benchmarks covering English-centric, Chinese-centric and Portuguese-centric translation.
Assessed the effectiveness of in-context retrieval methods and their impact on translation performance.
The pipeline improved translation performance compared with direct translation instructions.
There were notable differences in effectiveness across varying domains and languages.
Results indicated that larger retrieval databases enhanced translation refinement.
Quantitative analysis revealed improvements in sentence-level and word-level statistics.

Abstract

Domain-specific machine translation (MT) significantly benefits from large language models (LLMs) due to their strong instruction-following abilities and in-context learning (ICL) capabilities. Appropriate demonstration samples and feedback are essential for helping LLMs refine their translation outputs in real-world applications. However, the scarcity of in-domain samples and professional feedback creates practical limitations. Furthermore, the current ICL paradigm supplies parallel translation pairs but does not offer fine-grained domain features. To address these challenges, we propose a pipeline that collects in-domain translations from LLMs and generates synthetic, human-like feedback for revising these translations. The translations and their corresponding feedback are stored together to build a demonstration database, with each instance paired with the original in-domain translation and its revision. During online translation, similar in-domain translations can be retrieved as revision demonstrations. This process guides LLMs in iteratively refining their outputs by learning from demonstrations. We evaluate the proposed pipeline using the open-source models Llama3-8B-Instruct and Mistral-7B-Instruct-v0.3 on five domain-specific benchmarks for English-centric, Chinese-centric and Portuguese-centric translation. The results demonstrate the effectiveness of the pipeline in tailoring in-domain translations and improving translation performance compared to direct translation instructions. Additionally, we discuss the experimental results from the following perspectives: 1) the effectiveness of different in-context retrieval methods; 2) the observed differences across selected domains and languages; 3) the quantitative analysis of sentence-level and word-level statistics; and 4) the effect of ICL retrieval database size and decoding parameters.
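The retrieval step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Demonstration` record, the token-overlap (Jaccard) retriever, the prompt layout, and the bracketed placeholder strings are all assumptions made for illustration, and the abstract does not say which retrieval methods or prompt format the paper actually uses. The sketch only shows how a database of (source, draft, feedback, revision) entries could supply revision demonstrations for a new draft translation.

```python
from dataclasses import dataclass


@dataclass
class Demonstration:
    """One hypothetical database entry: a draft translation and its revision."""
    source: str    # in-domain source sentence
    draft: str     # original LLM translation
    feedback: str  # synthetic feedback on the draft
    revision: str  # revised translation


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a simple stand-in for a real retriever."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(db, query, k=2):
    """Return the k entries whose source sentence is most similar to the query."""
    return sorted(db, key=lambda d: jaccard(d.source, query), reverse=True)[:k]


def build_refinement_prompt(demos, source, draft):
    """Lay out retrieved demonstrations, then the new draft awaiting revision."""
    parts = [
        f"Source: {d.source}\nDraft: {d.draft}\n"
        f"Feedback: {d.feedback}\nRevision: {d.revision}"
        for d in demos
    ]
    parts.append(f"Source: {source}\nDraft: {draft}\nRevision:")
    return "\n\n".join(parts)


# Placeholder entries; the bracketed strings stand in for real translations.
db = [
    Demonstration("The patient was given 5 mg of the drug daily.",
                  "[draft A]", "[feedback A]", "[revision A]"),
    Demonstration("The contract terminates upon written notice.",
                  "[draft B]", "[feedback B]", "[revision B]"),
    Demonstration("The patient reported mild side effects of the drug.",
                  "[draft C]", "[feedback C]", "[revision C]"),
]

query = "The patient stopped taking the drug after two days."
top = retrieve(db, query, k=2)
prompt = build_refinement_prompt(top, query, "[new draft]")
```

The resulting `prompt` string would then be sent to the translation model, and the loop could be repeated to refine the output iteratively. A production retriever would more likely use embedding similarity than token overlap; the simpler measure is used here only to keep the sketch dependency-free.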


Cite This Study

Yang et al. studied this question.

synapsesocial.com/papers/6980fefbc1c9540dea81190a https://doi.org/10.1145/3787498
