What question did this study set out to answer?

This study asks whether word-level quality estimation can make automatic post-editing produce more accurate translations.

June 17, 2026 | Open Access

Automatic Post-editing through Word-level Quality Estimation with Minimum Bayes Risk Decoding

Key Points

Developed a two-stage pipeline in which a quality estimation model tags error spans and a post-editor refines the translation conditioned on those annotations.
Applied Minimum Bayes Risk (MBR) decoding to improve the quality estimation model's decoding.
Expanded two MQM datasets with post-edited translations generated using GPT-4.
Accurate word-level quality estimation improves translation quality in experiments across four translation directions.
MBR-enhanced quality estimation models outperform state-of-the-art baselines.
The Edit Agreement Test-F1 metric measures over- and undercorrection by comparing quality estimation predictions against gold annotations.

Abstract

Fine-grained word-level Quality Estimation (QE), such as Multidimensional Quality Metrics (MQM), provides error annotations that can enhance Automatic Post-Editing (APE) and APE evaluation. We study a two-stage QE-assisted APE pipeline: a QE model tags error spans, and a post-editor refines the translation conditioned on these annotations. We improve QE decoding using Minimum Bayes Risk (MBR) decoding. In addition, we introduce Edit Agreement Test-F1, a novel metric that measures over- or undercorrection by comparing QE predictions against gold annotations. We expand two MQM datasets with post-edited translations generated using GPT-4. Experiments in four translation directions show that accurate word-level QE improves translation quality and that our MBR-enhanced QE models outperform state-of-the-art baselines.
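
To make the idea of MBR decoding for word-level QE concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the utility function, the candidate representation, or how candidates are sampled. The sketch assumes each candidate is a binary tag sequence (1 = token flagged as an error), uses a token-level F1 as a stand-in utility, and the names `tag_f1` and `mbr_select` are hypothetical.

```python
from typing import Callable, Sequence

# Assumption: a QE prediction is a per-token tag list, 1 = error, 0 = OK.
Tags = Sequence[int]


def tag_f1(hyp: Tags, ref: Tags) -> float:
    """Token-level F1 over error-tagged positions (stand-in utility)."""
    tp = sum(1 for h, r in zip(hyp, ref) if h == 1 and r == 1)
    n_hyp, n_ref = sum(hyp), sum(ref)
    if n_hyp == 0 and n_ref == 0:
        return 1.0  # both sequences flag nothing: treat as full agreement
    if tp == 0:
        return 0.0
    precision, recall = tp / n_hyp, tp / n_ref
    return 2 * precision * recall / (precision + recall)


def mbr_select(
    candidates: Sequence[Tags],
    utility: Callable[[Tags, Tags], float] = tag_f1,
) -> Tags:
    """Pick the candidate with the highest average utility against the others.

    Each remaining candidate acts as a pseudo-reference, which is the
    standard sampling-based approximation of expected utility in MBR.
    """
    if not candidates:
        raise ValueError("need at least one candidate")

    def expected_utility(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        if not others:
            return 0.0
        return sum(utility(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=expected_utility)
    return candidates[best]


# Hypothetical candidate taggings of a five-token translation.
candidates = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
]
selected = mbr_select(candidates)
```

In this toy example the selected tagging is the one most of the candidates agree with, while the outlier that flags entirely different tokens scores zero against every other candidate. A real system would draw the candidates from the QE model and would likely use a span-aware utility.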
