Researchers assessing interrater agreement for ratings of a single target have increasingly used the rWG(j) index, but have found that it can display irregular behavior. Mathematical analyses show that this problem arises from the use of random response, operationalized by the variance of a uniform distribution (σ²EU), as the baseline of comparison. These analyses suggest that researchers should continue to use rWG(j) as a summary measure of interrater agreement, but should use maximum dissensus as the reference distribution for computing it. Although values of σ²EU can be descriptively misleading, they provide an important inferential baseline. Thus, σ²EU should be used in computing χ² tests of the departure of the observed response variance from random responding. Researchers should also examine interrater agreement as a theoretical variable in its own right, investigating the causes and consequences of rater dissensus.
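The contrast between the two reference distributions can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and the formulas are assumptions taken from the standard interrater-agreement literature rather than from the abstract itself. These are a discrete uniform variance of (A² − 1)/12 for A response options, a maximum-dissensus variance of (H − L)²/4 for raters split evenly between the scale endpoints L and H, an agreement index of the form 1 minus the ratio of mean observed item variance to the reference variance, and a test statistic of (K − 1)s²/σ²EU for K raters.

```python
from statistics import mean, variance


def uniform_variance(n_options):
    """Variance of a discrete uniform distribution over n_options scale points."""
    return (n_options ** 2 - 1) / 12


def max_dissensus_variance(low, high):
    """Variance when raters split evenly between the two scale endpoints."""
    return (high - low) ** 2 / 4


def agreement_index(item_ratings, reference_variance):
    """1 minus the ratio of mean observed item variance to a reference variance.

    item_ratings is a list of items, each a list of the raters' scores.
    Sample variance (n - 1 denominator) is an assumption of this sketch.
    """
    mean_var = mean(variance(ratings) for ratings in item_ratings)
    return 1 - mean_var / reference_variance


def chi_square_statistic(ratings, n_options):
    """(K - 1) * s^2 / uniform variance for one item rated by K raters."""
    k = len(ratings)
    return (k - 1) * variance(ratings) / uniform_variance(n_options)


if __name__ == "__main__":
    # Hypothetical data: four raters, one item, 5-point scale.
    close = [[4, 4, 5, 5]]
    split = [[1, 1, 5, 5]]
    for label, data in (("close", close), ("split", split)):
        print(label,
              agreement_index(data, uniform_variance(5)),
              agreement_index(data, max_dissensus_variance(1, 5)))
    print("chi-square statistic:", chi_square_statistic(close[0], 5))
```

With the uniform baseline, the evenly split ratings produce a strongly negative index, which is the kind of irregular behavior the abstract describes. The maximum-dissensus baseline keeps the same ratings much closer to zero, although with sample variance and very small groups the value can still dip slightly below zero. The statistic from `chi_square_statistic` would be compared against a χ² distribution with K − 1 degrees of freedom, a step this sketch leaves out.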
Lindell et al. studied this question.