This study examined how well current software implementations of four polytomous item response theory models fit several multiple-choice tests. The models were Bock's (1972) nominal model, Samejima's (1979) multiple-choice Model C, Thissen [...] Williams [...] Cross-validation samples of approximately 3,000 were used to evaluate goodness of fit. Both fit plots and χ² statistics were used to determine the adequacy of fit. Bock's model provided surprisingly good fit; adding parameters to the nominal model did not yield improvements in fit. FORSCORE provided generally good fit for Levine's nonparametric model across all tests.

Index terms: Bock's nominal model, FORSCORE, maximum likelihood formula scoring, MULTILOG, polytomous IRT.
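As a rough illustration of the quantities the abstract refers to, the sketch below computes option probabilities under the standard form of Bock's nominal model (a softmax over option-specific linear functions of ability) and a plain Pearson χ² comparing observed and expected option counts. The item parameters, the counts, and the function names are hypothetical and chosen only for illustration. The χ² shown is the textbook Pearson statistic, not necessarily the exact statistic used in the study, and none of this reflects the MULTILOG or FORSCORE implementations.

```python
import math


def nominal_probs(theta, slopes, intercepts):
    """Option probabilities under Bock's nominal model at ability theta.

    P(option k | theta) = exp(a_k * theta + c_k) / sum_h exp(a_h * theta + c_h)
    """
    logits = [a * theta + c for a, c in zip(slopes, intercepts)]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


def pearson_chi_square(observed, expected):
    """Textbook Pearson chi-square: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical four-option item; option 0 plays the role of the keyed answer.
slopes = [1.0, -0.2, -0.3, -0.5]
intercepts = [0.5, 0.1, -0.2, -0.4]

# Model-implied option probabilities for examinees near theta = 0.
probs = nominal_probs(0.0, slopes, intercepts)

# Hypothetical option counts for 200 cross-validation examinees near theta = 0.
observed = [85, 55, 35, 25]
expected = [sum(observed) * p for p in probs]
fit = pearson_chi_square(observed, expected)
```

In a cross-validation setting of the kind the abstract describes, the item parameters would come from one sample and the observed counts from a separate one, so the statistic is not flattered by parameters tuned to the same data.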
Drasgow et al. studied this question.