Randomized Significance Tests in Machine Translation

January 1, 2014 · Open Access

Abstract

Randomized methods of significance testing enable estimation of the probability that an increase in score has occurred simply by chance. In this paper, we examine the accuracy of three randomized methods of significance testing in the context of machine translation: paired bootstrap resampling, bootstrap resampling and approximate randomization. We carry out a large-scale human evaluation of shared task systems for two language pairs to provide a gold standard for tests. Results show very little difference in accuracy across the three methods of significance testing. Notably, the accuracy of all test/metric combinations for evaluation of English-to-Spanish is so low that there is not enough evidence to conclude they are any better than a random coin toss.
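
To make two of the tests named in the abstract concrete, here is a minimal sketch of approximate randomization and paired bootstrap resampling. It is not the authors' implementation. It assumes each system has one score per test segment and that the system-level score is the mean of those segment scores; the function names, the trial count, and the seed are all illustrative choices.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def approximate_randomization(scores_a, scores_b, trials=1000, seed=0):
    """Estimate a two-sided p-value by randomly swapping paired scores.

    Assumption: scores_a[i] and scores_b[i] are the two systems' scores
    for the same test segment i.
    """
    rng = random.Random(seed)
    observed = abs(mean(scores_a) - mean(scores_b))
    at_least_as_extreme = 0
    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(scores_a, scores_b):
            # Under the null hypothesis the system labels are exchangeable,
            # so swap each pair with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        if abs(mean(shuffled_a) - mean(shuffled_b)) >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)


def paired_bootstrap(scores_a, scores_b, trials=1000, seed=0):
    """Fraction of resampled test sets on which A does not outscore B.

    The same resampled segment indices are used for both systems,
    which is what makes the bootstrap "paired".
    """
    rng = random.Random(seed)
    n = len(scores_a)
    not_better = 0
    for _ in range(trials):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        if mean([scores_a[i] for i in idx]) <= mean([scores_b[i] for i in idx]):
            not_better += 1
    return not_better / trials
```

A small value from either function suggests the score difference is unlikely to be due to chance alone. For corpus-level metrics, the mean would need to be replaced by recomputing the metric on each shuffled or resampled set.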

Cite This Study

Graham et al. (2014) studied this question.

synapsesocial.com/papers/6a0e9cf506ecbe8334479d33 https://doi.org/10.3115/v1/w14-3333