Key points are not available for this paper at this time.
Despite being the focus of intensive research, evaluation of algorithms that generate referring expressions is still in its infancy. We describe a corpus-based evaluation methodology, applied to a number of classic algorithms in this area. The methodology focuses on balance and semantic transparency to enable comparison of human and algorithmic output. Although the Incremental Algorithm emerges as the best match, we found that its dependency on manually-set parameters makes its performance difficult to predict.
Gatt et al. (Mon,) studied this question.