January 1, 1992 | Open Access

The statistical significance of the MUC-4 results

Abstract

MUC-4 measures the performance of the participating systems with recall, precision, and F-measure scores. A difference in scores between any two systems may be due to chance or may reflect a significant difference between the two systems. To rule out the possibility that the difference is due to chance, statistical hypothesis testing is used. The testing method is a computationally intensive one known as approximate randomization. This paper discusses the method and the statistical significance of the results for the two MUC-4 test sets, TST3 and TST4.
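As an illustration of the approximate randomization idea named in the abstract, the following is a minimal Python sketch, not the procedure or code used for MUC-4. It assumes each system's output is summarized per message as a hypothetical `(correct, n_system, n_gold)` tuple, uses a simple balanced F-measure as the test statistic, and swaps the two systems' per-message results with probability 0.5 on each trial. The function names, the tuple layout, and the default trial count are all assumptions made for this example.

```python
import random


def f_measure(counts):
    """Balanced F-measure from per-message (correct, n_system, n_gold) tuples.

    The tuple layout is an assumption made for this sketch.
    """
    correct = sum(c for c, _, _ in counts)
    n_system = sum(s for _, s, _ in counts)
    n_gold = sum(g for _, _, g in counts)
    if correct == 0:
        return 0.0
    precision = correct / n_system
    recall = correct / n_gold
    return 2 * precision * recall / (precision + recall)


def approximate_randomization(sys_a, sys_b, metric=f_measure, trials=9999, seed=0):
    """Estimate how often a score gap this large arises by chance.

    sys_a and sys_b are per-message results for the same messages, in the
    same order. Returns an estimated significance level in (0, 1].
    """
    rng = random.Random(seed)
    observed = abs(metric(sys_a) - metric(sys_b))
    at_least_as_extreme = 0
    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(sys_a, sys_b):
            # Under the null hypothesis the system labels are exchangeable,
            # so swap the paired results with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        if abs(metric(shuffled_a) - metric(shuffled_b)) >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (at_least_as_extreme + 1) / (trials + 1)
```

A small returned value suggests that the observed difference between the two systems is unlikely to be due to chance alone. Shuffling whole messages, rather than individual slots, keeps each system's answers for a message together, which is one reasonable design choice when the answers within a message are not independent.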

Nancy Chinchor studied this question.

synapsesocial.com/papers/6a1c1953b33628da419d34a1 https://doi.org/10.3115/1072064.1072068