Issues pertaining to the quality of performance assessments are discussed. Traditional concepts of reliability and validity are important to performance tasks in that they help to establish the contexts in which such measures can be appropriately used and to create caveats for interpretation of results. Examples, both historical and contemporary, show a remarkable degree of consistency in the characteristics of data from human judgments of performance, data that bear directly on matters of trustworthiness and correctness of inferences from samples of complex performance. In particular, direct assessments of complex performance do not typically generalize from one task to another and thus require careful sampling of tasks to secure an acceptable degree of score reliability and validity for most uses. These observations suggest the pressing need for greater quality control in the design and execution of performance assessments. If such assessments are to have lasting effects on instruction and learning, then their technical properties must be understood and appreciated by developer and practitioner alike.
Dunbar et al. studied this question.