Key points are not available for this paper at this time.
In statistical natural language processing we always face the problem of sparse data. One way to reduce this problem is to group words into equivalence classes which is a standard method in statistical language modeling. In this paper we describe a method to determine bilingual word classes suitable for statistical machine translation. We develop an optimization criterion based on a maximum-likelihood approach and describe a clustering algorithm. We will show that the usage of the bilingual word classes we get can improve statistical machine translation.
Franz Josef Och (Fri,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: