Key points are not available for this paper at this time.
evaluation measures produce a ranking of all possible extract summaries of a document., Recall-based evaluation measures, which depend on costly human-generated ground truth summaries, produce uncorrelated rankings when ground truth is varied. This paper proposes using sentence-rankbased and content-based measures for evaluating extract summaries, and compares these with recallbased evaluation measures. Content-based measures increase the correlation of rankings induced by synonymous ground truths, and exhibit other desirable properties.
Donaway et al. (Sat,) studied this question.